“The author has spared himself no pains in his endeavour to
present the main ideas in the simplest and most intelligible form, 
and on the whole, in the sequence and connection in which they 
actually originated. In the interest of clearness, it appeared to 
me inevitable that I should repeat myself frequently, without pay- 
ing the slightest attention to the elegance of the presentation. I 
adhered scrupulously to the precept of that brilliant theoretical 
physicist L. Boltzmann, according to whom matters of elegance 
ought to be left to the tailor and to the cobbler.”

Albert Einstein, in Relativity, the Special and General Theory,
(1961), p. v
Preface


Learning physics is hard. Part of the problem is that physics is 
naturally expressed in mathematical language. When we teach 
we use the language of mathematics in the same way that we 
use our natural language. We depend upon a vast amount of 
shared knowledge and culture, and we only sketch an idea using 
mathematical idioms. We are insufficiently precise to convey an 
idea to a person who does not share our culture. Our problem 
is that since we share the culture we find it difficult to notice 
that what we say is too imprecise to be clearly understood by a 
student new to the subject. A student must simultaneously learn 
the mathematical language and the content that is expressed in 
that language. This is like trying to read Les Misérables while
struggling with French grammar. 

This book is an effort to ameliorate this problem for learn- 
ing the differential geometry needed as a foundation for a deep 
understanding of general relativity or quantum field theory. Our 
approach differs from the traditional one in several ways. Our cov- 
erage is unusual. We do not prove the general Stokes’s Theorem — 
this is well covered in many other books — instead, we show how it 
works in two dimensions. Because our target is relativity, we put 
lots of emphasis on the development of the covariant derivative, 
and we erect a common context for understanding both the Lie 
derivative and the covariant derivative. Most treatments of differ- 
ential geometry aimed at relativity assume that there is a metric 
(or pseudometric). By contrast, we develop as much material as 
possible independent of the assumption of a metric. This allows 
us to see what results depend on the metric when we introduce 
it. We also try to avoid the use of traditional index notation for 
tensors. Although one can become very adept at “index gymnas- 
tics,” that leads to much mindless (though useful) manipulation 
without much thought to meaning. Instead, we use a semantically 
richer language of vector fields and differential forms. 

But the single biggest difference between our treatment and 
others is that we integrate computer programming into our expla- 
nations. By programming a computer to interpret our formulas 
we soon learn whether or not a formula is correct. If a formula 
is not clear, it will not be interpretable. If it is wrong, we will 
get a wrong answer. In either case we are led to improve our 
program and as a result improve our understanding. We have
been teaching advanced classical mechanics at MIT for many years 
using this strategy. We use precise functional notation and we 
have students program in a functional language. The students 
enjoy this approach and we have learned a lot ourselves. It is the 
experience of writing software for expressing the mathematical 
content and the insights that we gain from doing it that we feel is 
revolutionary. We want others to have a similar experience. 
Prologue

Programming and Understanding 

One way to become aware of the precision required to unam- 
biguously communicate a mathematical idea is to program it for 
a computer. Rather than using canned programs purely as an 
aid to visualization or numerical computation, we use computer 
programming in a functional style to encourage clear thinking. 
Programming forces us to be precise and unambiguous, without 
forcing us to be excessively rigorous. The computer does not toler- 
ate vague descriptions or incomplete constructions. Thus the act 
of programming makes us keenly aware of our errors of reasoning 
or unsupported conclusions.¹

Although this book is about differential geometry, we can show 
how thinking about programming can help in understanding in a 
more elementary context. The traditional use of Leibniz’s notation 
and Newton’s notation is convenient in simple situations, but in 
more complicated situations it can be a serious handicap to clear 
reasoning. 

A mechanical system is described by a Lagrangian function of 
the system state (time, coordinates, and velocities). A motion of 
the system is described by a path that gives the coordinates for 
each moment of time. A path is allowed if and only if it satisfies 
the Lagrange equations. Traditionally, the Lagrange equations are 
written 

$$\frac{d}{dt}\frac{\partial L}{\partial \dot q} - \frac{\partial L}{\partial q} = 0.$$

What could this expression possibly mean? 

Let’s try to write a program that implements Lagrange equa- 
tions. What are Lagrange equations for? Our program must take 
a proposed path and give a result that allows us to decide if the 
path is allowed. This is already a problem; the equation shown 
above does not have a slot for a path to be tested. 


1 The idea of using computer programming to develop skills of clear thinking
was originally advocated by Seymour Papert. An extensive discussion of this 
idea, applied to the education of young children, can be found in Papert [13]. 
So we have to figure out how to insert the path to be tested.
The partial derivatives do not depend on the path; they are deriva- 
tives of the Lagrangian function and thus they are functions with 
the same arguments as the Lagrangian. But the time derivative 
d/dt makes sense only for a function of time. Thus we must 
be intending to substitute the path (a function of time) and its 
derivative (also a function of time) into the coordinate and velocity 
arguments of the partial derivative functions. 

So probably we meant something like the following (assume 
that w is a path through the coordinate configuration space, and 
so w(t) specifies the configuration coordinates at time t): 


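$$\frac{d}{dt}\left(\left.\frac{\partial L(t, q, \dot q)}{\partial \dot q}\right|_{\substack{q = w(t) \\ \dot q = \frac{dw(t)}{dt}}}\right) - \left.\frac{\partial L(t, q, \dot q)}{\partial q}\right|_{\substack{q = w(t) \\ \dot q = \frac{dw(t)}{dt}}} = 0.$$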


In this equation we see that the partial derivatives of the La- 
grangian function are taken, then the path and its derivative 
are substituted for the position and velocity arguments of the 
Lagrangian, resulting in an expression in terms of the time. 

This equation is complete. It has meaning independent of the 
context and there is nothing left to the imagination. The earlier 
equations require the reader to fill in lots of detail that is implicit 
in the context. They do not have a clear meaning independent of 
the context. 

By thinking computationally we have reformulated the La- 
grange equations into a form that is explicit enough to specify 
a computation. We could convert it into a program for any sym- 
bolic manipulation program because it tells us how to manipulate 
expressions to compute the residuals of Lagrange’s equations for 
a purported solution path.²


2 The residuals of equations are the expressions whose value must be zero if
the equations are satisfied. For example, if we know that for an unknown $x$,
$x^3 - x = 0$ then the residual is $x^3 - x$. We can try $x = -1$ and find a residual
of 0, indicating that our purported solution satisfies the equation. A residual
may provide information. For example, if we have the differential equation
$df(x)/dx - af(x) = 0$ and we plug in a test solution $f(x) = Ae^{bx}$ we obtain
the residual $(b - a)Ae^{bx}$, which can be zero only if $b = a$.
Functional Abstraction

But this corrected use of Leibniz notation is ugly. We had to
introduce extraneous symbols ($q$ and $\dot q$) in order to indicate the
argument position specifying the partial derivative. Nothing would
change here if we replaced $q$ and $\dot q$ by $a$ and $b$.³ We can
simplify the notation by admitting that the partial derivatives of the
Lagrangian are themselves new functions, and by specifying the
particular partial derivative by the position of the argument that
is varied:

$$\frac{d}{dt}\left((\partial_2 L)\left(t, w(t), \frac{d}{dt}w(t)\right)\right) - (\partial_1 L)\left(t, w(t), \frac{d}{dt}w(t)\right) = 0,$$

where $\partial_i L$ is the function which is the partial derivative of the
function $L$ with respect to the $i$th argument.⁴

Two different notions of derivative appear in this expression. 
The functions $\partial_2 L$ and $\partial_1 L$, constructed from the Lagrangian
L, have the same arguments as L. The derivative d/dt is an 
expression derivative. It applies to an expression that involves 
the variable t and it gives the rate of change of the value of the 
expression as the value of the variable t is varied. 

These are both useful interpretations of the idea of a derivative. 
But functions give us more power. There are many equivalent 
ways to write expressions that compute the same value. For 
example $1/(1/r_1 + 1/r_2) = (r_1 r_2)/(r_1 + r_2)$. These expressions
compute the same function of the two variables $r_1$ and $r_2$. The
first expression fails if $r_1 = 0$ but the second one gives the right
value of the function. If we abstract the function, say as $\Pi(r_1, r_2)$,
we can ignore the details of how it is computed. The ideas become 
clearer because they do not depend on the detailed shape of the 
expressions. 
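The point about equivalent expressions can be tried directly. The following is a minimal sketch in plain Scheme; the procedure names parallel-a and parallel-b are ours, chosen only for illustration. Both procedures compute the same function wherever the first is defined, but only the second survives a zero argument.

```scheme
;; Two expressions for the same function of r1 and r2.
;; The names parallel-a and parallel-b are hypothetical.
(define (parallel-a r1 r2)
  (/ 1 (+ (/ 1 r1) (/ 1 r2))))

(define (parallel-b r1 r2)
  (/ (* r1 r2) (+ r1 r2)))

(parallel-a 3 6)   ; 2
(parallel-b 3 6)   ; 2
(parallel-b 0 6)   ; 0
;; (parallel-a 0 6) would signal a division-by-zero error,
;; because (/ 1 0) is undefined for an exact zero.
```

A caller that sees only the abstracted function need not know which of the two bodies is in use, which is the sense in which the ideas do not depend on the shape of the expressions.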


3 That the symbols $q$ and $\dot q$ can be replaced by other arbitrarily chosen
nonconflicting symbols without changing the meaning of the expression tells us
that the partial derivative symbol is a logical quantifier, like forall and exists
($\forall$ and $\exists$).

4 The argument positions of the Lagrangian are indicated by indices starting 
with zero for the time argument. 
So let’s get rid of the expression derivative $d/dt$ and replace it
with an appropriate functional derivative. If $f$ is a function then
we will write $Df$ as the new function that is the derivative of $f$:⁵

$$(Df)(t) = \left.\frac{d}{dx} f(x)\right|_{x=t}.$$

To do this for the Lagrange equation we need to construct a 
function to take the derivative of. 

Given a configuration-space path $w$, there is a standard way
to make the state-space path. We can abstract this method as a
mathematical function $\Gamma$:

$$\Gamma[w](t) = \left(t, w(t), \frac{d}{dt}w(t)\right).$$

Using $\Gamma$ we can write:

$$\frac{d}{dt}\big((\partial_2 L)(\Gamma[w](t))\big) - (\partial_1 L)(\Gamma[w](t)) = 0.$$

If we now define composition of functions $(f \circ g)(x) = f(g(x))$,
we can express the Lagrange equations entirely in terms of functions:

$$D\big((\partial_2 L) \circ (\Gamma[w])\big) - (\partial_1 L) \circ (\Gamma[w]) = 0.$$

The functions $\partial_1 L$ and $\partial_2 L$ are partial derivatives of the
function $L$. Composition with $\Gamma[w]$ evaluates these partials with
coordinates and velocities appropriate for the path $w$, making functions
of time. Applying D takes the time derivative. The Lagrange 
equation states that the difference of the resulting functions of 
time must be zero. This statement of the Lagrange equation is 
complete, unambiguous, and functional. It is not encumbered 
with the particular choices made in expressing the Lagrangian. 
For example, it doesn’t matter if the time is named $t$ or $\tau$, and it
has an explicit place for the path to be tested. 

This expression is equivalent to a computer program:⁶


5 An explanation of functional derivatives is in Appendix B, page 202. 

6 The programs in this book are written in Scheme, a dialect of Lisp. The 
details of the language are not germane to the points being made. What is 
important is that it is mechanically interpretable, and thus unambiguous. In 
this book we require that the mathematical expressions be explicit enough 
(define ((Lagrange-equations Lagrangian) w)
  (- (D (compose ((partial 2) Lagrangian) (Gamma w)))
     (compose ((partial 1) Lagrangian) (Gamma w))))

In the Lagrange equations procedure the parameter Lagrangian 
is a procedure that implements the Lagrangian. The derivatives 
of the Lagrangian, for example ((partial 2) Lagrangian), are 
also procedures. The state-space path procedure (Gamma w) is 
constructed from the configuration-space path procedure w by the 
procedure Gamma: 

(define ((Gamma w) t)
  (up t (w t) ((D w) t)))

where up is a constructor for a data structure that represents a 
state of the dynamical system (time, coordinates, velocities). 

The result of applying the Lagrange-equations procedure to 
a procedure Lagrangian that implements a Lagrangian function 
is a procedure that takes a configuration-space path procedure w 
and returns a procedure that gives the residual of the Lagrange 
equations for that path at a time. 

For example, consider the harmonic oscillator, with Lagrangian 

$$L(t, q, v) = \tfrac{1}{2} m v^2 - \tfrac{1}{2} k q^2,$$

for mass m and spring constant k. This Lagrangian is imple- 
mented by 

(define ((L-harmonic m k) local)
  (let ((q (coordinate local))
        (v (velocity local)))
    (- (* 1/2 m (square v))
       (* 1/2 k (square q)))))

We know that the motion of a harmonic oscillator is a sinusoid 
with a given amplitude $a$, frequency $\omega$, and phase $\varphi$:

$$x(t) = a \cos(\omega t + \varphi).$$


that they can be expressed as computer programs. Scheme is chosen because 
it is easy to write programs that manipulate representations of mathematical 
functions. An informal description of Scheme can be found in Appendix A. 
The use of Scheme to represent mathematical objects can be found in Ap- 
pendix B. A formal description of Scheme can be obtained in [10]. You can 
get the software from [21].
Suppose we have forgotten how the constants in the solution relate
to the physical parameters of the oscillator. Let’s plug in the 
proposed solution and look at the residual: 

(define (proposed-solution t)
  (* 'a (cos (+ (* 'omega t) 'phi))))

(show-expression
 (((Lagrange-equations (L-harmonic 'm 'k))
   proposed-solution)
  't))


$$\cos(\omega t + \varphi)\, a\, (k - m \omega^2)$$


The residual here shows that for nonzero amplitude, the only
solutions allowed are ones where $(k - m\omega^2) = 0$ or $\omega = \sqrt{k/m}$.
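It may help to see where this residual comes from by carrying out the substitution by hand. The following is a sketch, using the harmonic-oscillator Lagrangian $L(t, q, v) = \tfrac{1}{2} m v^2 - \tfrac{1}{2} k q^2$ and the proposed path $x(t) = a \cos(\omega t + \varphi)$, with $\Gamma[x]$ the state-space path built from $x$ as in the functional form of the Lagrange equations:

```latex
\begin{align*}
(\partial_1 L)(t, q, v) &= -k q, \qquad (\partial_2 L)(t, q, v) = m v, \\
D\big((\partial_2 L) \circ \Gamma[x]\big)(t) &= m\, D^2 x(t) = -m \omega^2 a \cos(\omega t + \varphi), \\
\big((\partial_1 L) \circ \Gamma[x]\big)(t) &= -k\, a \cos(\omega t + \varphi), \\
D\big((\partial_2 L) \circ \Gamma[x]\big)(t) - \big((\partial_1 L) \circ \Gamma[x]\big)(t)
  &= a \cos(\omega t + \varphi)\,(k - m \omega^2).
\end{align*}
```

The last line is the residual reported by the program, so the hand computation and the mechanical one agree.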

But, suppose we had no idea what the solution looks like. We 
could propose a literal function for the path: 

(show-expression
 (((Lagrange-equations (L-harmonic 'm 'k))
   (literal-function 'x))
  't))


$$k\, x(t) + m\, D^2 x(t)$$


If this residual is zero we have the Lagrange equation for the 
harmonic oscillator. 
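The same residual can be checked numerically without the symbolic system. The following is a rough sketch in plain Scheme, intended for an interpreter without scmutils loaded. The procedures D, Gamma, partial, compose, Lagrange-equations, and L-harmonic defined here are our own numerical stand-ins that borrow the names used in the text; they are not the library procedures. A state is assumed to be a plain list (t q v) for one degree of freedom, and derivatives are approximated by central differences with an assumed step size.

```scheme
;; Numerical stand-ins (hypothetical), one degree of freedom.
(define step 1e-3)   ; assumed finite-difference step

;; Derivative of a function of one real argument.
(define (D f)
  (lambda (t) (/ (- (f (+ t step)) (f (- t step))) (* 2 step))))

;; State-space path (t, w(t), Dw(t)) from a configuration path w.
(define (Gamma w)
  (lambda (t) (list t (w t) ((D w) t))))

;; Copy of a state with slot i (0 = t, 1 = q, 2 = v) shifted by delta.
(define (shift-slot state i delta)
  (let loop ((k 0) (s state))
    (if (null? s)
        '()
        (cons (if (= k i) (+ (car s) delta) (car s))
              (loop (+ k 1) (cdr s))))))

;; Partial derivative of a state function with respect to slot i.
(define (partial i)
  (lambda (L)
    (lambda (state)
      (/ (- (L (shift-slot state i step))
            (L (shift-slot state i (- step))))
         (* 2 step)))))

(define (compose f g) (lambda (x) (f (g x))))

;; Residual of the Lagrange equations for path w at time t.
(define (Lagrange-equations Lagrangian)
  (lambda (w)
    (lambda (t)
      (- ((D (compose ((partial 2) Lagrangian) (Gamma w))) t)
         ((compose ((partial 1) Lagrangian) (Gamma w)) t)))))

(define (L-harmonic m k)
  (lambda (state)
    (let ((q (cadr state)) (v (caddr state)))
      (- (* 1/2 m v v) (* 1/2 k q q)))))

;; Unit-amplitude, zero-phase test paths.
(define (cosine-path omega) (lambda (t) (cos (* omega t))))

;; m = 2, k = 8, so the allowed frequency is sqrt(k/m) = 2.
(define residual-allowed
  (((Lagrange-equations (L-harmonic 2 8)) (cosine-path 2)) 0.3))
(define residual-rejected
  (((Lagrange-equations (L-harmonic 2 8)) (cosine-path 1)) 0.3))
```

Under these assumptions residual-allowed should be zero up to discretization error, while residual-rejected should be close to $\cos(0.3)\,(k - m\omega^2) = 6\cos(0.3)$, in agreement with the symbolic residual above.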

Note that we can flexibly manipulate representations of math- 
ematical functions. (See Appendices A and B.) 

We started out thinking that the original statement of La- 
grange’s equations accurately captured the idea. But we really 
don’t know until we try to teach it to a naive student. If the 
student is sufficiently ignorant, but is willing to ask questions, we 
are led to clarify the equations in the way that we did. There 
is no dumber but more insistent student than a computer. A 
computer will absolutely refuse to accept a partial statement, with 
missing parameters or a type error. In fact, the original statement 
of Lagrange’s equations contained an obvious type error: the 
Lagrangian is a function of multiple variables, but the d/dt is 
applicable only to functions of one variable. 
Introduction


Philosophy is written in that great book which 
ever lies before our eyes — I mean the 
Universe — but we cannot understand it if we do 
not learn the language and grasp the symbols in 
which it is written. This book is written in the 
mathematical language, and the symbols are 
triangles, circles, and other geometrical figures 
without whose help it is impossible to comprehend 
a single word of it, without which one wanders in 
vain through a dark labyrinth. 

Galileo Galilei [8] 


Differential geometry is a mathematical language that can be used 
to express physical concepts. In this introduction we show a typ- 
ical use of this language. Do not panic! At this point we do not 
expect you to understand the details of what we are showing. All 
will be explained as needed in the text. The purpose is to get the 
flavor of this material. 

At the North Pole inscribe a line in the ice perpendicular to 
the Greenwich Meridian. Hold a stick parallel to that line and 
walk down the Greenwich Meridian keeping the stick parallel to 
itself as you walk. (The phrase “parallel to itself” is a way of
saying that as you walk you keep its orientation unchanged. The
stick will be aligned East-West, perpendicular to your direction of
travel.) When you get to the Equator the stick will be parallel to 
the Equator. Turn East, and walk along the Equator, keeping the 
stick parallel to the Equator. Continue walking until you get to 
the 90°E meridian. When you reach the 90°E meridian turn North 
and walk back to the North Pole keeping the stick parallel to itself. 
Note that the stick is perpendicular to your direction of travel. 
When you get to the Pole note that the stick is perpendicular to 
the line you inscribed in the ice. But you started with that stick 
parallel to that line and you kept the stick pointing in the same 
direction on the Earth throughout your walk — how did it change 
orientation? 
The answer is that you walked a closed loop on a curved surface.
As seen in three dimensions the stick was actually turning as
you walked along the Equator, because you always kept the stick 
parallel to the curving surface of the Earth. But as a denizen of 
a 2-dimensional surface, it seemed to you that you kept the stick
parallel to itself as you walked, even when making a turn. Even 
if you had no idea that the surface of the Earth was embedded in 
a 3-dimensional space you could use this experiment to conclude 
that the Earth was not flat. This is a small example of intrinsic 
geometry. It shows that the idea of parallel transport is not sim- 
ple. For a general surface it is necessary to explicitly define what 
we mean by parallel. 

If you walked a smaller loop, the angle between the starting ori- 
entation and the ending orientation of the stick would be smaller. 
For small loops it would be proportional to the area of the loop 
you walked. This constant of proportionality is a measure of the 
curvature. The result does not depend on how fast you walked, 
so this is not a dynamical phenomenon. 
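The walk described above gives a quick check of this proportionality. As a sketch, assume the Earth is a perfect sphere of radius $R$ (the symbol $\Delta\alpha$ for the net rotation of the stick is ours). The loop from the Pole down the Greenwich Meridian, along the Equator to 90°E, and back to the Pole encloses one eighth of the sphere, and the stick came back rotated by a right angle:

```latex
\begin{align*}
A &= \tfrac{1}{8}\,(4 \pi R^2) = \tfrac{1}{2} \pi R^2, \\
\Delta\alpha &= \tfrac{\pi}{2}, \\
\frac{\Delta\alpha}{A} &= \frac{1}{R^2}.
\end{align*}
```

The ratio $1/R^2$ is the curvature of a sphere of radius $R$. On a round sphere the curvature is the same everywhere, so the proportionality happens to hold for this large loop as well as for small ones; on a general surface it should be expected only in the small-loop limit.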

Denizens of the surface may play ball games. The balls are 
constrained to the surface; otherwise they are free particles. The 
paths of the balls are governed by dynamical laws. This motion 
is a solution of the Euler-Lagrange equations¹ for the free-particle
Lagrangian with coordinates that incorporate the constraint of 
living in the surface. There are coefficients of terms in the Euler- 
Lagrange equations that arise naturally in the description of the 
behavior of the stick when walking loops on the surface, connecting 
the static shape of the surface with the dynamical behavior of the 
balls. It turns out that the dynamical evolution of the balls may 
be viewed as parallel transport of the ball’s velocity vector in the 
direction of the velocity vector. This motion by parallel transport 
of the velocity is called geodesic motion. 

So there are deep connections between the dynamics of particles 
and the geometry of the space that the particles move in. If we un- 
derstand this connection we can learn about dynamics by studying 
geometry and we can learn about geometry by studying dynam- 
ics. We enter dynamics with a Lagrangian and the associated 
Lagrange equations. Although this formulation exposes many im- 
portant features of the system, such as how symmetries relate to 


1 It is customary to shorten “Euler-Lagrange equations” to “Lagrange equa- 
tions.” We hope Leonhard Euler is not disturbed. 
conserved quantities, the geometry is not apparent. But when we
express the Lagrangian and the Lagrange equations in differential 
geometry language, geometric properties become apparent. In the 
case of systems with no potential energy the Euler-Lagrange equa- 
tions are equivalent to the geodesic equations on the configuration 
manifold. In fact, the coefficients of terms in the Lagrange equa- 
tions are Christoffel coefficients, which define parallel transport 
on the manifold. Let’s look into this a bit. 

Lagrange Equations 

We write the Lagrange equations in functional notation² as follows:

$$D(\partial_2 L \circ \Gamma[q]) - \partial_1 L \circ \Gamma[q] = 0.$$

In SICM [19], Section 1.6.3, we showed that a Lagrangian de- 
scribing the free motion of a particle subject to a coordinate- 
dependent constraint can be obtained by composing a free-particle 
Lagrangian with a function that describes how dynamical states 
transform given the coordinate transformation that describes the 
constraints. 

A Lagrangian for a free particle of mass $m$ and velocity $v$ is just
its kinetic energy, $mv^2/2$. The procedure Lfree implements the
free Lagrangian:³

(define ((Lfree mass) state)
  (* 1/2 mass (square (velocity state))))

For us the dynamical state of a system of particles is a tuple of 
time, coordinates, and velocities. The free-particle Lagrangian 
depends only on the velocity part of the state. 

For motion of a point constrained to move on the surface of 
a sphere the configuration space has two dimensions. We can 
describe the position of the point with the generalized coordi- 
nates colatitude and longitude. If the sphere is embedded in 3- 
dimensional space the position of the point in that space can be 


2 A short introduction to our functional notation, and why we have chosen it, 
is given in the prologue: Programming and Understanding. More details can 
be found in Appendix B. 

3 An informal description of the Scheme programming language can be found 
in Appendix A. 
given by a coordinate transformation from colatitude and longitude
to three rectangular coordinates.

For a sphere of radius $R$ the procedure sphere->R3 implements
the transformation of coordinates from colatitude $\theta$ and longitude
$\phi$ on the surface of the sphere to rectangular coordinates in the
embedding space. (The $z$ axis goes through the North Pole, and
the Equator is in the plane $z = 0$.)


(define ((sphere->R3 R) state)
  (let ((q (coordinate state)))
    (let ((theta (ref q 0)) (phi (ref q 1)))
      (up (* R (sin theta) (cos phi))     ; x
          (* R (sin theta) (sin phi))     ; y
          (* R (cos theta))))))           ; z


The coordinate transformation maps the generalized coordi- 
nates on the sphere to the 3-dimensional rectangular coordinates. 
Given this coordinate transformation we construct a correspond- 
ing transformation of velocities; these make up the state trans- 
formation. The procedure F->C implements the derivation of a 
transformation of states from a coordinate transformation: 

(define ((F->C F) state)
  (up (time state)
      (F state)
      (+ (((partial 0) F) state)
         (* (((partial 1) F) state)
            (velocity state)))))

A Lagrangian governing free motion on a sphere of radius R is then 
the composition of the free Lagrangian with the transformation of 
states. 

(define (Lsphere m R)
  (compose (Lfree m) (F->C (sphere->R3 R))))

So the value of the Lagrangian at an arbitrary dynamical state is:

((Lsphere 'm 'R)
 (up 't (up 'theta 'phi) (up 'thetadot 'phidot)))

(+ (* 1/2 m (expt R 2) (expt thetadot 2))
   (* 1/2 m (expt R 2) (expt (sin theta) 2) (expt phidot 2)))
or, in infix notation:

$$\tfrac{1}{2} m R^2 \dot\theta^2 + \tfrac{1}{2} m R^2 (\sin\theta)^2 \dot\phi^2. \tag{1.1}$$
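This value can be checked numerically without the symbolic machinery. The following is a sketch in plain Scheme under stated assumptions: the helper names sphere-point, embedded-velocity, kinetic-energy, and lagrangian-eq-1-1 are ours, not procedures from the text, and the rectangular velocity is approximated by a central difference of the embedding along the coordinate velocity, which is the chain rule that the state transformation encodes.

```scheme
;; Hypothetical helper names; plain Scheme, no scmutils required.

;; Embedding of the sphere of radius R: (theta, phi) -> (x y z).
(define (sphere-point R theta phi)
  (list (* R (sin theta) (cos phi))
        (* R (sin theta) (sin phi))
        (* R (cos theta))))

;; Rectangular velocity from a central difference along the
;; coordinate velocity (thetadot, phidot); eps is an assumed step.
(define (embedded-velocity R theta phi thetadot phidot)
  (let* ((eps 1e-5)
         (ahead  (sphere-point R
                               (+ theta (* eps thetadot))
                               (+ phi (* eps phidot))))
         (behind (sphere-point R
                               (- theta (* eps thetadot))
                               (- phi (* eps phidot)))))
    (map (lambda (a b) (/ (- a b) (* 2 eps))) ahead behind)))

;; Free-particle Lagrangian: kinetic energy of a rectangular velocity.
(define (kinetic-energy m velocity)
  (* 1/2 m (apply + (map (lambda (c) (* c c)) velocity))))

;; Closed form in generalized coordinates, as in equation (1.1).
(define (lagrangian-eq-1-1 m R theta thetadot phidot)
  (+ (* 1/2 m R R thetadot thetadot)
     (* 1/2 m R R (expt (sin theta) 2) phidot phidot)))

;; Arbitrary sample state: m = 2, R = 3, theta = 0.7, phi = 1.1.
(define numeric
  (kinetic-energy 2 (embedded-velocity 3 0.7 1.1 0.4 -0.5)))
(define closed-form
  (lagrangian-eq-1-1 2 3 0.7 0.4 -0.5))
(define difference (abs (- numeric closed-form)))
```

If the closed form is right, difference should be very small compared with either value, limited only by the finite-difference error. Note that the longitude itself does not enter the closed form, only its rate of change.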

The Metric 

Let’s now take a step into the geometry. A surface has a metric 
which tells us how to measure sizes and angles at every point on 
the surface. (Metrics are introduced in Chapter 9.) 

The metric is a symmetric function of two vector fields that 
gives a number for every point on the manifold. (Vector fields are 
introduced in Chapter 3). Metrics may be used to compute the 
length of a vector field at each point, or alternatively to compute 
the inner product of two vector fields at each point. For example, 
the metric for the sphere of radius R is 

$$g(u, v) = R^2\, d\theta(u)\, d\theta(v) + R^2 (\sin\theta)^2\, d\phi(u)\, d\phi(v), \tag{1.2}$$

where $u$ and $v$ are vector fields, and $d\theta$ and $d\phi$ are one-form fields
that extract the named components of the vector-field argument.
(One-form fields are introduced in Chapter 3.) We can think of
$d\theta(u)$ as a function of a point that gives the size of the vector field
$u$ in the $\theta$ direction at the point. Notice that $g(u, u)$ is a weighted
sum of the squares of the components of $u$. In fact, if we identify

$$d\theta(v) = \dot\theta, \qquad d\phi(v) = \dot\phi,$$

then the coefficients in the metric are the same as the coefficients 
in the value of the Lagrangian, equation (1.1), apart from a factor 
of m/2. 

We can generalize this result and write a Lagrangian for free 
motion of a particle of mass m on a manifold with metric g: 

$$L_2(x, v) = \sum_{ij} \tfrac{1}{2}\, m\, g_{ij}(x)\, v^i v^j. \tag{1.3}$$

This is written using indexed variables to indicate components
of the geometric objects expressed with respect to an unspecified
coordinate system. The metric coefficients $g_{ij}$ are, in general, a
function of the position coordinates $x$, because the properties of
the space may vary from place to place.

We can capture this geometric statement as a program: 

(define ((L2 mass metric) place velocity)
  (* 1/2 mass ((metric velocity velocity) place)))

This program gives the Lagrangian in a coordinate-independent, 
geometric way. It is entirely in terms of geometric objects, such as 
a place on the configuration manifold, the velocity at that place, 
and the metric that describes the local shape of the manifold. 
But to compute we need a coordinate system. We express the 
dynamical state in terms of coordinates and velocity components 
in the coordinate system. For each coordinate system there is 
a natural vector basis and the geometric velocity vectors can be 
constructed by contracting the basis with the components of the 
velocity. Thus, we can form a coordinate representation of the 
Lagrangian. 

(define ((Lc mass metric coordsys) state)
  (let ((x (coordinates state)) (v (velocities state))
        (e (coordinate-system->vector-basis coordsys)))
    ((L2 mass metric) ((point coordsys) x) (* e v))))

The manifold point m represented by the coordinates x is given 
by (define m ((point coordsys) x)). The coordinates of m in a 
different coordinate system are given by ((chart coordsys2) m).
The manifold point m is a geometric object that is the same point 
independent of how it is specified. Similarly, the velocity vector ev 
is a geometric object, even though it is specified using components 
v with respect to the basis e. Both v and e have as many compo- 
nents as the dimension of the space so their product is interpreted 
as a contraction. 

Let’s make a general metric on a 2-dimensional real manifold:⁴

(define the-metric (literal-metric 'g R2-rect))


4 The procedure literal-metric provides a metric. It is a general symmetric
function of two vector fields, with literal functions of the coordinates of the
manifold points for its coefficients in the given coordinate system. The quoted
symbol 'g is used to make the names of the literal coefficient functions. Literal
functions are discussed in Appendix B.
The metric is expressed in rectangular coordinates, so the coordinate
system is R2-rect.⁵ The component functions will be labeled
as subscripted gs.

We can now make the Lagrangian for the system: 

(define L (Lc 'm the-metric R2-rect))

And we can apply our Lagrangian to an arbitrary state: 

(L (up 't (up 'x 'y) (up 'vx 'vy)))

(+ (* 1/2 m (g_00 (up x y)) (expt vx 2))
   (* m (g_01 (up x y)) vx vy)
   (* 1/2 m (g_11 (up x y)) (expt vy 2)))


Compare this result with equation (1.3). 

Euler-Lagrange Residuals 

The Euler-Lagrange equations are satisfied on realizable paths.
Let $\gamma$ be a path on the manifold of configurations. (A path is a
map from the 1-dimensional real line to the configuration manifold.
We introduce maps between manifolds in Chapter 6.) Consider
an arbitrary path:⁶

(define gamma (literal-manifold-map 'q R1-rect R2-rect))

The values of $\gamma$ are points on the manifold, not a coordinate
representation of the points. We may evaluate gamma only on points of
the real-line manifold; gamma produces points on the $R^2$ manifold.
So to go from the literal real-number coordinate 't to a point
on the real line we use ((point R1-rect) 't) and to go from
a point m in $R^2$ to its coordinate representation we use
((chart R2-rect) m). (The procedures point and chart are introduced in
Chapter 2.) Thus


5 R2-rect is the usual rectangular coordinate system on the 2-dimensional real
manifold. (See Section 2.1, page 13.) We supply common coordinate systems 
for n-dimensional real manifolds. For example, R2-polar is a polar coordinate 
system on the same manifold. 

6 The procedure literal-manifold-map makes a map from the manifold implied
by its second argument to the manifold implied by the third argument.
These arguments must be coordinate systems. The quoted symbol that is the 
first argument is used to name the literal coordinate functions that define the 
map. 
((chart R2-rect) (gamma ((point R1-rect) 't)))

(up (q^0 t) (q^1 t))

So, to work with coordinates we write: 

(define coordinate-path
  (compose (chart R2-rect) gamma (point R1-rect)))

(coordinate-path 't)

(up (q^0 t) (q^1 t))

Now we can compute the residuals of the Euler-Lagrange equations,
but we get a large messy expression that we will not show.⁷
However, we will save it to compare with the residuals of the 
geodesic equations. 

(define Lagrange-residuals
  (((Lagrange-equations L) coordinate-path) 't))

Geodesic Equations 

Now we get deeper into the geometry. The traditional way to 
write the geodesic equations is 

$$\nabla_v v = 0 \tag{1.4}$$

where $\nabla$ is a covariant derivative operator. Roughly, $\nabla_v w$ is a
directional derivative. It gives a measure of the variation of the
vector field $w$ as you walk along the manifold in the direction of $v$.
(We will explain this in depth in Chapter 7.) $\nabla_v v = 0$ is intended
to convey that the velocity vector is parallel-transported by itself.
When you walked East on the Equator you had to hold the stick so
that it was parallel to the Equator. But the stick is constrained to
the surface of the Earth, so moving it along the Equator required
turning it in three dimensions. The $\nabla$ thus must incorporate the
3-dimensional shape of the Earth to provide a notion of “parallel”
appropriate for the denizens of the surface of the Earth. This
information will appear as the “Christoffel coefficients” in the
coordinate representation of the geodesic equations.

The trouble with the traditional way to write the geodesic equa- 
tions (1.4) is that the arguments to the covariant derivative are 


7 For an explanation of equation residuals see page xvi. 
vector fields and the velocity along the path is not a vector field.
A more precise way of stating this relation is:

$$\nabla^{\gamma}_{\partial/\partial t}\, d\gamma(\partial/\partial t) = 0. \tag{1.5}$$

(We know that this may be unfamiliar notation, but we will ex- 
plain it in Chapter 7.) 

In coordinates, the geodesic equations are expressed

$$D^2 q^i(t) + \sum_{jk} \Gamma^i_{jk}(\gamma(t))\, Dq^j(t)\, Dq^k(t) = 0, \tag{1.6}$$

where $q(t)$ is the coordinate path corresponding to the manifold
path $\gamma$, and $\Gamma^i_{jk}(m)$ are Christoffel coefficients. The
$\Gamma^i_{jk}(m)$ describe the “shape” of the manifold close to the
manifold point $m$. They can be derived from the metric $g$.

We can get and save the geodesic equation residuals by: 

(define geodesic-equation-residuals
  (((((covariant-derivative Cartan gamma) d/dt)
     ((differential gamma) d/dt))
    (chart R2-rect))
   ((point R1-rect) 't)))

where d/dt is a vector field on the real line 8 and Cartan is a 
way of encapsulating the geometry, as specified by the Christoffel 
coefficients. The Christoffel coefficients are computed from the 
metric: 

(define Cartan
  (Christoffel->Cartan
    (metric->Christoffel-2 the-metric
      (coordinate-system->basis R2-rect))))

The two messy residual results that we did not show are related 
by the metric. If we change the representation of the geodesic 
equations by “lowering” them using the mass and the metric, we 
see that the residuals are equal: 


8 We established t as a coordinate function on the rectangular coordinates of
the real line by

(define-coordinates t R1-rect)

This had the effect of also defining d/dt as a coordinate vector field and dt as
a one-form field on the real line.

(define metric-components
  (metric->components the-metric
    (coordinate-system->basis R2-rect)))

(- Lagrange-residuals
   (* (* 'm (metric-components (gamma ((point R1-rect) 't))))
      geodesic-equation-residuals))
(down 0 0)


This establishes that for a 2-dimensional space the Euler-Lagrange 
equations are equivalent to the geodesic equations. The Christof- 
fel coefficients that appear in the geodesic equation correspond to 
coefficients of terms in the Euler-Lagrange equations. This anal- 
ysis will work for any number of dimensions (but will take your 
computer longer in higher dimensions, because the complexity in- 
creases). 

Exercise 1.1: Motion on a Sphere 

The metric for a unit sphere, expressed in colatitude $\theta$ and longitude $\phi$,
is

$$g(\mathsf{u}, \mathsf{v}) = d\theta(\mathsf{u})\, d\theta(\mathsf{v}) + (\sin\theta)^2\, d\phi(\mathsf{u})\, d\phi(\mathsf{v}).$$

Compute the Lagrange equations for motion of a free particle on the
sphere and convince yourself that they describe great circles. For example,
consider motion on the equator ($\theta = \pi/2$) and motion on a line of
longitude ($\phi$ is constant).


Manifolds


A manifold is a generalization of our idea of a smooth surface 
embedded in Euclidean space. For an n-dimensional manifold, 
around every point there is a simply-connected open set, the coordinate
patch, and a one-to-one continuous function, the coordinate
function or chart, mapping every point in that open set to a tuple
of n real numbers, the coordinates. In general, several charts are
needed to label all points on a manifold. It is required that if a 
region is in more than one coordinate patch then the coordinates 
are consistent in that the function mapping one set of coordinates 
to another is continuous (and perhaps differentiable to some de- 
gree). A consistent system of coordinate patches and coordinate 
functions that covers the entire manifold is called an atlas. 

An example of a 2-dimensional manifold is the surface of a 
sphere or of a coffee cup. The space of all configurations of a planar 
double pendulum is a more abstract example of a 2-dimensional 
manifold. A manifold that looks locally Euclidean may not look 
like Euclidean space globally: for example, it may not be simply 
connected. The surface of the coffee cup is not simply connected, 
because there is a hole in the handle for your fingers. 

An example of a coordinate function is the function that maps 
points in a simply-connected open neighborhood of the surface 
of a sphere to the tuple of latitude and longitude. 1 If we want 
to talk about motion on the Earth, we can identify the space of 
configurations to a 2-sphere (the surface of a 3-dimensional ball). 
The map from the 2-sphere to the 3-dimensional coordinates of a 
point on the surface of the Earth captures the shape of the Earth. 

1 The open set for a latitude-longitude coordinate system cannot include either
pole (because longitude is not defined at the poles) or the 180° meridian (where
the longitude is discontinuous). Other coordinate systems are needed to cover
these places.

Two angles specify the configuration of the planar double pendulum.
The manifold of configurations is a torus, where each
point on the torus corresponds to a configuration of the double
pendulum. The constraints, such as the lengths of the pendulum
rods, are built into the map between the generalized coordinates
of points on the torus and the arrangements of masses in
3-dimensional space.

There are computational objects that we can use to model man- 
ifolds. For example, we can make an object that represents the 
plane 2 

(define R2 (make-manifold R^n 2))

and give it the name R2. One useful patch of the plane is the one 
that contains the origin and covers the entire plane. 3 

(define U (patch 'origin R2))


2.1 Coordinate Functions 

A coordinate function $\chi$ maps points in a coordinate patch of a
manifold to a coordinate tuple: 4

$$x = \chi(\mathsf{m}), \tag{2.1}$$

where $x$ may have a convenient tuple structure. Usually, the
coordinates are arranged as an “up structure”; the coordinates are
selected with superscripts:

$$x^i = \chi^i(\mathsf{m}). \tag{2.2}$$

The number of independent components of $x$ is the dimension of
the manifold.

Assume we have two coordinate functions $\chi$ and $\chi'$. The
coordinate transformation from $\chi'$ coordinates to $\chi$ coordinates is just
the composition $\chi \circ \chi'^{-1}$, where $\chi'^{-1}$ is the functional inverse of
$\chi'$ (see figure 2.1). We assume that the coordinate transformation
is continuous and differentiable to any degree we require.


2 The expression R^n gives only one kind of manifold. We also have spheres
S^n and SO3.

3 The word origin is an arbitrary symbol here. It labels a predefined patch in
R^n manifolds.

4 In the text that follows we will use sans-serif names, such as $\mathsf{f}$, $\mathsf{v}$, $\mathsf{m}$, to refer
to objects defined on the manifold. Objects that are defined on coordinates
(tuples of real numbers) will be named with symbols like $f$, $v$, $x$.

Figure 2.1 Here there are two overlapping coordinate patches that are
the domains of the two coordinate functions $\chi$ and $\chi'$. It is possible to
represent manifold points in the overlap using either coordinate system.
The coordinate transformation from $\chi'$ coordinates to $\chi$ coordinates is
just the composition $\chi \circ \chi'^{-1}$.


Given a coordinate system coordsys for a patch on a manifold,
the procedure that implements the function $\chi$ that gives coordinates
for a point is (chart coordsys). The procedure that implements
the inverse map that gives a point for coordinates is
(point coordsys).

We can have both rectangular and polar coordinates on a patch 
of the plane identified by the origin: 5,6 


;; Some charts on the patch U

(define R2-rect (coordinate-system 'rectangular U))

(define R2-polar (coordinate-system 'polar/cylindrical U))

5 The rectangular coordinates are good for the entire plane, but the polar
coordinates are singular at the origin because the angle is not defined. Also,
the patch for polar coordinates must exclude one ray from the origin, because
of the angle variable.

6 We can avoid explicitly naming the patch:

(define R2-rect (coordinate-system-at 'rectangular 'origin R2))

For each of the coordinate systems above we obtain the coordinate
functions and their inverses:

(define R2-rect-chi (chart R2-rect))

(define R2-rect-chi-inverse (point R2-rect))

(define R2-polar-chi (chart R2-polar))

(define R2-polar-chi-inverse (point R2-polar))

The coordinate transformations are then just compositions. The 
polar coordinates of a rectangular point are: 

((compose R2-polar-chi R2-rect-chi-inverse)
 (up 'x0 'y0))
(up (sqrt (+ (expt x0 2) (expt y0 2))) (atan y0 x0))


And the rectangular coordinates of a polar point are: 

((compose R2-rect-chi R2-polar-chi-inverse)
 (up 'r0 'theta0))
(up (* r0 (cos theta0)) (* r0 (sin theta0)))


And we can obtain the Jacobian of the polar-to-rectangular trans- 
formation by taking its derivative: 7 

((D (compose R2-rect-chi R2-polar-chi-inverse))
 (up 'r0 'theta0))
(down (up (cos theta0) (sin theta0))
      (up (* -1 r0 (sin theta0)) (* r0 (cos theta0))))
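
The same construction works in the opposite direction. As a minimal sketch, assuming the scmutils environment used in the text and repeating the definitions above so that the fragment can be evaluated on its own, the derivative of the rectangular-to-polar transformation is obtained as follows. The name rect->polar is introduced here only for illustration.

```scheme
;; Sketch only: assumes the scmutils system (make-manifold, R^n,
;; coordinate-system-at, chart, point, compose, D, up) as used in the text.
(define R2 (make-manifold R^n 2))
(define R2-rect  (coordinate-system-at 'rectangular 'origin R2))
(define R2-polar (coordinate-system-at 'polar/cylindrical 'origin R2))

;; Coordinate transformation: rectangular coordinates -> polar coordinates.
;; (rect->polar is a hypothetical name, not one defined by the text.)
(define rect->polar
  (compose (chart R2-polar) (point R2-rect)))

;; Its derivative (Jacobian) at the symbolic point (x0, y0).
((D rect->polar) (up 'x0 'y0))
```

At any point where both charts are valid, this Jacobian is the inverse of the polar-to-rectangular Jacobian evaluated at the corresponding polar coordinates. That relation is used in equation (3.22).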


2.2 Manifold Functions 

Let $\mathsf{f}$ be a real-valued function on a manifold $\mathsf{M}$: this function
maps points $\mathsf{m}$ on the manifold to real numbers.

This function has a coordinate representation $f_\chi$ with respect
to the coordinate function $\chi$ (see figure 2.2):

$$f_\chi = \mathsf{f} \circ \chi^{-1}. \tag{2.3}$$

Both the coordinate representation $f_\chi$ and the tuple $x$ depend
on the coordinate system, but the value $f_\chi(x)$ is independent of
coordinates:

$$f_\chi(x) = (\mathsf{f} \circ \chi^{-1})(\chi(\mathsf{m})) = \mathsf{f}(\mathsf{m}). \tag{2.4}$$


7 See Appendix B for an introduction to tuple arithmetic and a discussion of
derivatives of functions with structured input or output.

Figure 2.2 The coordinate function $\chi$ maps points on the manifold
in the coordinate patch to a tuple of coordinates. A function $\mathsf{f}$ on the
manifold $\mathsf{M}$ can be represented in coordinates by a function $f_\chi = \mathsf{f} \circ \chi^{-1}$.

The subscript $\chi$ may be dropped when it is unambiguous.

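For example, in a 2-dimensional real manifold the coordinates
of a manifold point $\mathsf{m}$ are a pair of real numbers,

$$(x, y) = \chi(\mathsf{m}), \tag{2.5}$$

and the manifold function $\mathsf{f}$ is represented in coordinates by a
function $f$ that takes a pair of real numbers and produces a real
number

$$\begin{aligned} f &: \mathbf{R}^2 \to \mathbf{R} \\ f &: (x, y) \mapsto f(x, y). \end{aligned} \tag{2.6}$$

We define our manifold function

$$\begin{aligned} \mathsf{f} &: \mathsf{M} \to \mathbf{R} \\ \mathsf{f} &: \mathsf{m} \mapsto (f \circ \chi)(\mathsf{m}). \end{aligned} \tag{2.7}$$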

Manifold Functions Are Coordinate Independent 

We can illustrate the coordinate independence with a program.
We will show that an arbitrary manifold function f, when defined
by its coordinate representation in rectangular coordinates, has
the same behavior when applied to a manifold point independent
of whether the point is specified in rectangular or polar coordinates.

We define a manifold function by specifying its behavior in
rectangular coordinates: 8

(define f
  (compose (literal-function 'f-rect R2->R) R2-rect-chi))

where R2->R is a signature for functions that map an up structure 
of two reals to a real: 

(define R2->R (-> (UP Real Real) Real)) 

We can specify a typical manifold point using its rectangular co- 
ordinates: 

(define R2-rect-point (R2-rect-chi-inverse (up 'x0 'y0)))

We can describe the same point using its polar coordinates: 

(define corresponding-polar-point
  (R2-polar-chi-inverse
    (up (sqrt (+ (square 'x0) (square 'y0)))
        (atan 'y0 'x0))))

(f R2-rect-point) and (f corresponding-polar-point) agree, 
even though the point has been specified in two different coordi- 
nate systems: 

(f R2-rect-point)
(f-rect (up x0 y0))

(f corresponding-polar-point)
(f-rect (up x0 y0))
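
The construction is symmetric in the choice of chart. The following sketch, under the same assumptions (the scmutils procedures used above), defines a manifold function by its polar representation and applies it to the same point specified in each coordinate system. The names g, g-polar, m-rect, and m-polar are introduced here only for illustration.

```scheme
;; Sketch only: assumes the scmutils environment used in the text.
(define R2 (make-manifold R^n 2))
(define R2-rect  (coordinate-system-at 'rectangular 'origin R2))
(define R2-polar (coordinate-system-at 'polar/cylindrical 'origin R2))
(define R2->R (-> (UP Real Real) Real))

;; g is specified by its coordinate representation g-polar in the polar chart.
(define g
  (compose (literal-function 'g-polar R2->R) (chart R2-polar)))

;; The same manifold point, specified in each coordinate system.
(define m-rect
  ((point R2-rect) (up 'x0 'y0)))
(define m-polar
  ((point R2-polar)
   (up (sqrt (+ (square 'x0) (square 'y0)))
       (atan 'y0 'x0))))

;; Both applications should reduce to g-polar applied to the polar
;; coordinates of the point.
(g m-rect)
(g m-polar)
```

The value of a manifold function depends only on the point, so which chart was used to define the function and which was used to name the point should not matter.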

Naming Coordinate Functions 

To make things a bit easier, we can give names to the individual 
coordinate functions associated with a coordinate system. Here we 
name the coordinate functions for the R2-rect coordinate system 
x and y and for the R2-polar coordinate system r and theta. 

(define-coordinates (up x y) R2-rect)

(define-coordinates (up r theta) R2-polar)


8 Alternatively, we can define the same function in a shorthand:

(define f (literal-manifold-function 'f-rect R2-rect))






This allows us to extract the coordinates from a point, indepen- 
dent of the coordinate system used to specify the point. 

(x (R2-rect-chi-inverse (up 'x0 'y0)))
x0

(x (R2-polar-chi-inverse (up 'r0 'theta0)))
(* r0 (cos theta0))

(r (R2-polar-chi-inverse (up 'r0 'theta0)))
r0

(r (R2-rect-chi-inverse (up 'x0 'y0)))
(sqrt (+ (expt x0 2) (expt y0 2)))

(theta (R2-rect-chi-inverse (up 'x0 'y0)))
(atan y0 x0)


We can work with the coordinate functions in a natural manner, 
defining new manifold functions in terms of them: 9 

(define h (+ (* x (square r)) (cube y)))

(h R2-rect-point)
(+ (expt x0 3) (* x0 (expt y0 2)) (expt y0 3))

We can also apply h to a point defined in terms of its polar coor- 
dinates: 

(h (R2-polar-chi-inverse (up 'r0 'theta0)))
(+ (* (expt r0 3) (expt (sin theta0) 3))
   (* (expt r0 3) (cos theta0)))
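
As a check by hand, the two results are the same function written in two coordinate systems. Using $r^2 = x^2 + y^2$ for the rectangular form, and $x = r\cos\theta$, $y = r\sin\theta$ for the polar form:

$$\begin{aligned}
h &= x\,r^2 + y^3 \\
  &= x\,(x^2 + y^2) + y^3 = x^3 + x y^2 + y^3 && \text{(rectangular)} \\
  &= (r\cos\theta)\,r^2 + (r\sin\theta)^3 = r^3\cos\theta + r^3\sin^3\theta && \text{(polar)}
\end{aligned}$$

Evaluated at $(x_0, y_0)$ and at $(r_0, \theta_0)$ respectively, these agree with the two expressions printed above.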

Exercise 2.1: Curves 

A curve may be specified in different coordinate systems. For example, a 
cardioid constructed by rolling a circle of radius a around another circle 
of the same radius is described in polar coordinates by the equation 

$$r = 2a(1 + \cos\theta).$$


9 This is actually a nasty, but traditional, abuse of notation. An expression
like cos(r) can either mean the cosine of the angle r (if r is a number), or the
composition $\cos \circ\, r$ (if r is a function). In our system (cos r) behaves in this
way: it either computes the cosine of r or is treated as (compose cos r),
depending on what r is.

We can convert this to rectangular coordinates by evaluating the residual
in rectangular coordinates.

(define-coordinates (up r theta) R2-polar)

((- r (* 2 'a (+ 1 (cos theta))))
 ((point R2-rect) (up 'x 'y)))
(/ (+ (* -2 a x)
      (* -2 a (sqrt (+ (expt x 2) (expt y 2))))
      (expt x 2) (expt y 2))
   (sqrt (+ (expt x 2) (expt y 2))))

The numerator of this expression is the equivalent residual in rectangular
coordinates. If we rearrange terms and square it we get the traditional
formula for the cardioid

$$(x^2 + y^2 - 2ax)^2 = 4a^2(x^2 + y^2).$$

a. The rectangular coordinate equation for the Lemniscate of Bernoulli
is

$$(x^2 + y^2)^2 = 2a^2(x^2 - y^2).$$

Find the expression in polar coordinates.

b. Describe a helix space curve in both rectangular and cylindrical
coordinates. Use the computer to show the correspondence. Note that we
provide a cylindrical coordinate system on the manifold $\mathbf{R}^3$ for you to
use. It is called R3-cyl, with coordinates (r, theta, z).

Exercise 2.2: Stereographic Projection 

A stereographic projection is a correspondence between points on the 
unit sphere and points on the plane cutting the sphere at its equator. 
(See figure 2.3.) 

The coordinate system for points on the sphere in terms of rectangular
coordinates of corresponding points on the plane is S2-Riemann. 10
The procedure (chart S2-Riemann) gives the rectangular coordinates
on the plane for every point on the sphere, except for the North Pole.
The procedure (point S2-Riemann) gives the point on the sphere given
rectangular coordinates on the plane. The usual spherical coordinate
system on the sphere is S2-spherical.

10 The plane with the addition of a point at infinity is conformally equivalent to
the sphere by this correspondence. This correspondence is called the Riemann
sphere, in honor of the great mathematician Bernhard Riemann (1826-1866),
who made major contributions to geometry.

Figure 2.3 For each point on the sphere (except for its north pole)
a line is drawn from the north pole through the point and extending to
the equatorial plane. The corresponding point on the plane is where the
line intersects the plane. The rectangular coordinates of this point on
the plane are the Riemann coordinates of the point on the sphere. The
points on the plane can also be specified with polar coordinates $(\rho, \theta)$
and the points on the sphere are specified both by Riemann coordinates
and the traditional colatitude and longitude $(\phi, \lambda)$.

We can compute the colatitude and longitude of a point on the sphere
corresponding to a point on the plane with the following incantation:


((compose
  (chart S2-spherical)
  (point S2-Riemann)
  (chart R2-rect)
  (point R2-polar))
 (up 'rho 'theta))
(up (acos (/ (+ -1 (expt rho 2))
             (+ +1 (expt rho 2))))
    theta)

Perform an analogous computation to get the polar coordinates of the
point on the plane corresponding to a point on the sphere given by its
colatitude and longitude.


Vector Fields and One-Form Fields


We want a way to think about how a function varies on a manifold.
Suppose we have some complex linkage, such as a multiple
pendulum. The potential energy is an important function on the 
multi-dimensional configuration manifold of the linkage. To un- 
derstand the dynamics of the linkage we need to know how the 
potential energy changes as the configuration changes. The change 
in potential energy for a step of a certain size in a particular di- 
rection in the configuration space is a real physical quantity; it 
does not depend on how we measure the direction or the step size. 
What exactly this means is to be determined: What is a step size? 
What is a direction? We cannot subtract two configurations to 
determine the distance between them. It is our job here to make 
sense of this idea. 

So we would like something like a derivative, but there are prob- 
lems. Since we cannot subtract two manifold points, we cannot 
take the derivative of a manifold function in the way described 
in elementary calculus. But we can take the derivative of a co- 
ordinate representation of a manifold function, because it takes 
real-number coordinates as its arguments. This is a start, but it 
is not independent of coordinate system. Let’s see what we can 
build out of this. 


3.1 Vector Fields 

In multiple dimensions the derivative of a function is the multiplier
for the best linear approximation of the function at each argument
point: 1

$$f(x + \Delta x) \approx f(x) + (Df(x))\,\Delta x \tag{3.1}$$

1 In multiple dimensions the derivative $Df(x)$ is a down tuple structure of
the partial derivatives and the increment $\Delta x$ is an up tuple structure, so the
indicated product is to be interpreted as a contraction. (See equation B.8.)

The derivative $Df(x)$ is independent of $\Delta x$. Although the derivative
depends on the coordinates, the product $(Df(x))\,\Delta x$ is invariant
under change of coordinates in the following sense. Let
$\phi = \chi \circ \chi'^{-1}$ be a coordinate transformation, and $x = \phi(y)$. Then
$\Delta x = D\phi(y)\,\Delta y$ is the linear approximation to the change in $x$
when $y$ changes by $\Delta y$. If $f$ and $g$ are the representations of a
manifold function in the two coordinate systems, $g(y) = f(\phi(y)) = f(x)$,
then the linear approximations to the increments in $f$ and
$g$ are equal:

$$Dg(y)\,\Delta y = Df(\phi(y))\,(D\phi(y)\,\Delta y) = Df(x)\,\Delta x.$$

The invariant product $(Df(x))\,\Delta x$ is the directional derivative
of $f$ at $x$ with respect to the vector specified by the tuple of
components $\Delta x$ in the coordinate system. We can generalize this
idea to allow the vector at each point to depend on the point,
making a vector field. Let $b$ be a function of coordinates. We then
have a directional derivative of $f$ at each point $x$, determined by $b$:

$$D_b(f)(x) = (Df(x))\,b(x). \tag{3.2}$$

Now we bring this back to the manifold and develop a useful
generalization of the idea of directional derivative for functions on a
manifold, rather than functions on $\mathbf{R}^n$. A vector field on a manifold
is an assignment of a vector to each point on the manifold.
In elementary geometry, a vector is an arrow anchored at a point 
on the manifold with a magnitude and a direction. In differential 
geometry, a vector is an operator that takes directional deriva- 
tives of manifold functions at its anchor point. The direction and 
magnitude of the vector are the direction and scale factor of the 
directional derivative. 

Let $\mathsf{m}$ be a point on a manifold, $\mathsf{v}$ be a vector field on the manifold,
and $\mathsf{f}$ be a real-valued function on the manifold. Then $\mathsf{v}(\mathsf{f})$
is the directional derivative of the function $\mathsf{f}$ and $\mathsf{v}(\mathsf{f})(\mathsf{m})$ is the
directional derivative of the function $\mathsf{f}$ at the point $\mathsf{m}$. The vector
field is an operator that takes a real-valued manifold function and
a manifold point and produces a number. The order of arguments
is chosen to make $\mathsf{v}(\mathsf{f})$ be a new manifold function that can be
manipulated further. Directional derivative operators, unlike ordinary
derivative operators, produce a result of the same type as
their argument. Note that there is no mention here of any coordinate
system. The vector field specifies a direction and magnitude
at each manifold point that is independent of how it is described
using any coordinate system.

A useful way to characterize a vector field in a particular coordinate
system is by applying it to the coordinate functions. The
resulting functions $b_{\chi,\mathsf{v}}$ are called the coordinate component functions
or coefficient functions of the vector field; they measure how
quickly the coordinate functions change in the direction of the
vector field, scaled by the magnitude of the vector field:

$$b_{\chi,\mathsf{v}} = \mathsf{v}(\chi) \circ \chi^{-1}. \tag{3.3}$$

Note that we have chosen the coordinate components to be functions
of the coordinate tuple, not of a manifold point.

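A vector with coordinate components $b_{\chi,\mathsf{v}}$ applies to a manifold
function $\mathsf{f}$ via

$$\begin{aligned}
\mathsf{v}(\mathsf{f})(\mathsf{m}) &= \big((D(\mathsf{f} \circ \chi^{-1})\; b_{\chi,\mathsf{v}}) \circ \chi\big)(\mathsf{m}) && (3.4) \\
&= D(\mathsf{f} \circ \chi^{-1})(\chi(\mathsf{m}))\; b_{\chi,\mathsf{v}}(\chi(\mathsf{m})) && (3.5) \\
&= \sum_i \partial_i(\mathsf{f} \circ \chi^{-1})(\chi(\mathsf{m}))\; b^i_{\chi,\mathsf{v}}(\chi(\mathsf{m})). && (3.6)
\end{aligned}$$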

In equation (3.4), the quantity $\mathsf{f} \circ \chi^{-1}$ is the coordinate representation
of the manifold function $\mathsf{f}$. We take its derivative, and weight
the components of the derivative with the coordinate components
$b_{\chi,\mathsf{v}}$ of the vector field that specify its direction and magnitude.
Since this product is a function of coordinates we use $\chi$ to extract
the coordinates from the manifold point $\mathsf{m}$. In equation (3.5), the
composition of the product with the coordinate chart $\chi$ is replaced
by function evaluation. In equation (3.6) the tuple multiplication
is expressed explicitly as a sum of products of corresponding components.
So the application of the vector is a linear combination
of the partial derivatives of $\mathsf{f}$ in the coordinate directions weighted
by the vector components. This computes the rate of change of $\mathsf{f}$
in the direction specified by the vector.

Equations (3.3) and (3.5) are consistent:

$$\begin{aligned}
\mathsf{v}(\chi)(\chi^{-1}(x)) &= D(\chi \circ \chi^{-1})(x)\; b_{\chi,\mathsf{v}}(x) \\
&= D(I)(x)\; b_{\chi,\mathsf{v}}(x) \\
&= b_{\chi,\mathsf{v}}(x).
\end{aligned} \tag{3.7}$$

The coefficient tuple $b_{\chi,\mathsf{v}}(x)$ is an up structure compatible for
addition to the coordinates. Note that for any vector field $\mathsf{v}$ the
coefficients $b_{\chi,\mathsf{v}}(x)$ are different for different coordinate functions $\chi$.
In the text that follows we will usually drop the subscripts on $b$,
understanding that it is dependent on the coordinate system and
the vector field.

We implement the definition of a vector field (3.4) as: 

(define (components->vector-field components coordsys)
  (define (v f)
    (compose (* (D (compose f (point coordsys)))
                components)
             (chart coordsys)))
  (procedure->vector-field v))

The vector field is an operator, like derivative. 2 

Given a coordinate system and coefficient functions that map 
coordinates to real values, we can make a vector field. For exam- 
ple, a general vector field can be defined by giving components 
relative to the coordinate system R2-rect by 

(define v
  (components->vector-field
    (up (literal-function 'b^0 R2->R)
        (literal-function 'b^1 R2->R))
    R2-rect))

To make it convenient to define literal vector fields we provide
a shorthand:

(define v (literal-vector-field 'b R2-rect))

This makes a vector field with component functions named b^0
and b^1 and names the result v. When this vector field is applied
to an arbitrary manifold function it gives the directional derivative
of that manifold function in the direction specified by the
components b^0 and b^1:

((v (literal-manifold-function 'f-rect R2-rect)) R2-rect-point)
(+ (* (((partial 0) f-rect) (up x0 y0)) (b^0 (up x0 y0)))
   (* (((partial 1) f-rect) (up x0 y0)) (b^1 (up x0 y0))))

This result is what we expect from equation (3.6).

2 An operator is just like a procedure except that multiplication is interpreted
as composition. For example, the derivative procedure is made into an operator
D so that we can say (expt D 2) and expect it to compute the second
derivative. The procedure procedure->vector-field makes a vector-field
operator.

We can recover the coordinate components of the vector field
by applying the vector field to the coordinate chart:

((v (chart R2-rect)) R2-rect-point)
(up (b^0 (up x0 y0)) (b^1 (up x0 y0)))
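
Equation (3.3) can be evaluated in any chart, not only the one used to define the field. A minimal sketch, assuming the scmutils procedures used in the text, computes the components of the literal field v relative to the polar chart. The name b-polar is introduced here only for illustration.

```scheme
;; Sketch only: assumes the scmutils environment used in the text.
(define R2 (make-manifold R^n 2))
(define R2-rect  (coordinate-system-at 'rectangular 'origin R2))
(define R2-polar (coordinate-system-at 'polar/cylindrical 'origin R2))

;; A literal vector field whose components b^0, b^1 are given
;; relative to the rectangular chart.
(define v (literal-vector-field 'b R2-rect))

;; Equation (3.3) in the polar chart: apply v to the polar chart,
;; then precompose with the inverse chart so that the result is a
;; function of polar coordinates. (b-polar is a hypothetical name.)
(define b-polar
  (compose (v (chart R2-polar)) (point R2-polar)))

(b-polar (up 'r0 'theta0))
```

The result should be expressed through b^0 and b^1 evaluated at the rectangular coordinates of the point, combined with the derivative of the rectangular-to-polar transformation. The component functions depend on the chart even though the vector field does not.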


Coordinate Representation 

The vector field $\mathsf{v}$ has a coordinate representation $v$:

$$\begin{aligned}
\mathsf{v}(\mathsf{f})(\mathsf{m}) &= D(\mathsf{f} \circ \chi^{-1})(\chi(\mathsf{m}))\; b(\chi(\mathsf{m})) \\
&= Df(x)\; b(x) \\
&= v(f)(x),
\end{aligned} \tag{3.8}$$

with the definitions $f = \mathsf{f} \circ \chi^{-1}$ and $x = \chi(\mathsf{m})$. The function $b$ is
the coefficient function for the vector field $\mathsf{v}$. It provides a scale
factor for the component in each coordinate direction. However, $v$
is the coordinate representation of the vector field $\mathsf{v}$ in that it takes
directional derivatives of coordinate representations of manifold
functions.

Given a vector field v and a coordinate system coordsys we can
construct the coordinate representation of the vector field. 3

(define (coordinatize v coordsys)
  (define ((coordinatized-v f) x)
    (let ((b (compose (v (chart coordsys))
                      (point coordsys))))
      (* ((D f) x) (b x))))
  (make-operator coordinatized-v))

We can apply a coordinatized vector field to a function of coordi- 
nates to get the same answer as before. 

(((coordinatize v R2-rect) (literal-function 'f-rect R2->R))
 (up 'x0 'y0))
(+ (* (((partial 0) f-rect) (up x0 y0)) (b^0 (up x0 y0)))
   (* (((partial 1) f-rect) (up x0 y0)) (b^1 (up x0 y0))))

Vector Field Properties 

The vector fields on a manifold form a vector space over the field
of real numbers and a module over the ring of real-valued manifold
functions. A module is like a vector space except that there is no
multiplicative inverse operation on the scalars of a module. Manifold
functions that are not the zero function do not necessarily
have multiplicative inverses, because they can have isolated zeros.
So the manifold functions form a ring, not a field, and vector fields
must be a module over the ring of manifold functions rather than
a vector space.

3 The make-operator procedure takes a procedure and returns an operator.

Vector fields have the following properties. Let $\mathsf{u}$ and $\mathsf{v}$ be
vector fields and let $\alpha$ be a real-valued manifold function. Then

$$(\mathsf{u} + \mathsf{v})(\mathsf{f}) = \mathsf{u}(\mathsf{f}) + \mathsf{v}(\mathsf{f}) \tag{3.9}$$

$$(\alpha\mathsf{u})(\mathsf{f}) = \alpha(\mathsf{u}(\mathsf{f})). \tag{3.10}$$

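Vector fields are linear operators. Assume $\mathsf{f}$ and $\mathsf{g}$ are functions
on the manifold, $a$ and $b$ are real constants. 4 The constants $a$ and
$b$ are not manifold functions, because vector fields take derivatives.
See equation (3.13).

$$\mathsf{v}(a\mathsf{f} + b\mathsf{g})(\mathsf{m}) = a\,\mathsf{v}(\mathsf{f})(\mathsf{m}) + b\,\mathsf{v}(\mathsf{g})(\mathsf{m}) \tag{3.11}$$

$$\mathsf{v}(a\mathsf{f})(\mathsf{m}) = a\,\mathsf{v}(\mathsf{f})(\mathsf{m}) \tag{3.12}$$

Vector fields satisfy the product rule (Leibniz rule).

$$\mathsf{v}(\mathsf{f}\mathsf{g})(\mathsf{m}) = \mathsf{v}(\mathsf{f})(\mathsf{m})\,\mathsf{g}(\mathsf{m}) + \mathsf{f}(\mathsf{m})\,\mathsf{v}(\mathsf{g})(\mathsf{m}) \tag{3.13}$$

Vector fields satisfy the chain rule. Let $F$ be a function on the
range of $\mathsf{f}$.

$$\mathsf{v}(F \circ \mathsf{f})(\mathsf{m}) = DF(\mathsf{f}(\mathsf{m}))\,\mathsf{v}(\mathsf{f})(\mathsf{m}) \tag{3.14}$$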
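
These properties can be checked symbolically in the style of the residual computations used earlier. The sketch below, which assumes the scmutils procedures introduced in the text (literal-vector-field, literal-manifold-function) and introduces the names g, g-rect, and m only for illustration, forms the residual of the product rule (3.13) for a literal vector field and two literal manifold functions.

```scheme
;; Sketch only: assumes the scmutils environment used in the text.
(define R2 (make-manifold R^n 2))
(define R2-rect (coordinate-system-at 'rectangular 'origin R2))

(define v (literal-vector-field 'b R2-rect))
(define f (literal-manifold-function 'f-rect R2-rect))
(define g (literal-manifold-function 'g-rect R2-rect))

;; A typical manifold point, named by its rectangular coordinates.
(define m ((point R2-rect) (up 'x0 'y0)))

;; Residual of the product rule (3.13): v(fg) - (v(f) g + f v(g)) at m.
;; If the rule holds, the simplified result should be 0.
((- (v (* f g))
    (+ (* (v f) g) (* f (v g))))
 m)
```

The same pattern, with the left and right sides of (3.11) or (3.14) in place of (3.13), gives a residual check for linearity or for the chain rule.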


3.2 Coordinate-Basis Vector Fields 

4 If $\mathsf{f}$ has structured output then $\mathsf{v}(\mathsf{f})$ is the structure resulting from $\mathsf{v}$ being
applied to each component of $\mathsf{f}$.

For an n-dimensional manifold any set of n linearly independent
vector fields 5 form a basis in that any vector field can be expressed
as a linear combination of the basis fields with manifold-function
coefficients. Given a coordinate system we can construct a basis
as follows: we choose the component tuple $b_i(x)$ (see equation 3.5)
to be the ith unit tuple $u_i(x)$ (an up tuple with one
in the ith position and zeros in all other positions), selecting the
partial derivative in that direction. Here $u_i$ is a constant function.
Like $b$, it formally takes coordinates of a point as an argument,
but it ignores them. We then define the basis vector field $\mathsf{X}_i$ by

$$\begin{aligned}
\mathsf{X}_i(\mathsf{f})(\mathsf{m}) &= D(\mathsf{f} \circ \chi^{-1})(\chi(\mathsf{m}))\; u_i(\chi(\mathsf{m})) \\
&= \partial_i(\mathsf{f} \circ \chi^{-1})(\chi(\mathsf{m})).
\end{aligned} \tag{3.15}$$

5 A set of vector fields, $\{\mathsf{v}_i\}$, is linearly independent with respect to manifold
functions if we cannot find nonzero manifold functions, $\{\mathsf{a}_i\}$, such that

$$\sum_i \mathsf{a}_i \mathsf{v}_i(\mathsf{f}) = \mathsf{0}(\mathsf{f}),$$

where $\mathsf{0}$ is the vector field such that $\mathsf{0}(\mathsf{f})(\mathsf{m}) = 0$ for all $\mathsf{f}$ and $\mathsf{m}$.

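In terms of $\mathsf{X}_i$ the vector field of equation (3.6) is

$$\mathsf{v}(\mathsf{f})(\mathsf{m}) = \sum_i \mathsf{X}_i(\mathsf{f})(\mathsf{m})\; b^i(\chi(\mathsf{m})). \tag{3.16}$$

We can also write

$$\mathsf{v}(\mathsf{f})(\mathsf{m}) = \mathsf{X}(\mathsf{f})(\mathsf{m})\; b(\chi(\mathsf{m})), \tag{3.17}$$

letting the tuple algebra do its job.

The basis vector field is often written

$$\frac{\partial}{\partial x^i} = \mathsf{X}_i, \tag{3.18}$$

to call to mind that it is an operator that computes the directional
derivative in the ith coordinate direction.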

In addition to making the coordinate functions, the procedure
define-coordinates also makes the traditional named basis vectors.
Using these we can examine the application of a rectangular
basis vector to a polar coordinate function:

(define-coordinates (up x y) R2-rect)

(define-coordinates (up r theta) R2-polar)

((d/dx (square r)) R2-rect-point)
(* 2 x0)

More general functions and vectors can be made as combinations
of these simple pieces:

(((+ d/dx (* 2 d/dy)) (+ (square r) (* 3 x))) R2-rect-point)
(+ 3 (* 2 x0) (* 4 y0))

Coordinate Transformations

Consider a coordinate change from the chart $\chi$ to the chart $\chi'$.

$$\begin{aligned}
\mathsf{X}(\mathsf{f})(\mathsf{m}) &= D(\mathsf{f} \circ \chi^{-1})(\chi(\mathsf{m})) \\
&= D(\mathsf{f} \circ (\chi')^{-1} \circ \chi' \circ \chi^{-1})(\chi(\mathsf{m})) \\
&= D(\mathsf{f} \circ (\chi')^{-1})(\chi'(\mathsf{m}))\,\big(D(\chi' \circ \chi^{-1})\big)(\chi(\mathsf{m})) \\
&= \mathsf{X}'(\mathsf{f})(\mathsf{m})\,\big(D(\chi' \circ \chi^{-1})\big)(\chi(\mathsf{m})).
\end{aligned} \tag{3.19}$$

This is the rule for the transformation of basis vector fields. The
second factor can be recognized as “$\partial x'/\partial x$,” the Jacobian. 6

The vector field does not depend on coordinates. So, from
equation (3.17), we have

$$\mathsf{v}(\mathsf{f})(\mathsf{m}) = \mathsf{X}(\mathsf{f})(\mathsf{m})\,b(\chi(\mathsf{m})) = \mathsf{X}'(\mathsf{f})(\mathsf{m})\,b'(\chi'(\mathsf{m})). \tag{3.20}$$

Using equation (3.19) with $x = \chi(\mathsf{m})$ and $x' = \chi'(\mathsf{m})$, we deduce

$$D(\chi' \circ \chi^{-1})(x)\,b(x) = b'(x'). \tag{3.21}$$

Because $\chi' \circ \chi^{-1}$ is the inverse function of $\chi \circ (\chi')^{-1}$, their derivatives
are multiplicative inverses,

$$D(\chi' \circ \chi^{-1})(x) = \big(D(\chi \circ (\chi')^{-1})(x')\big)^{-1}, \tag{3.22}$$

and so

$$b(x) = D(\chi \circ (\chi')^{-1})(x')\,b'(x'), \tag{3.23}$$

as expected. 7



6 This notation helps one remember the transformation rule:

$$\frac{\partial \mathsf{f}}{\partial x^i} = \sum_j \frac{\partial \mathsf{f}}{\partial x'^j}\,\frac{\partial x'^j}{\partial x^i},$$

which is the relation in the usual Leibniz notation. As Spivak pointed out in
Calculus on Manifolds, p.45, $\mathsf{f}$ means something different on each side of the
equation.

7 For coordinate paths $q$ and $q'$ related by $q(t) = (\chi \circ (\chi')^{-1})(q'(t))$ the velocities
are related by $Dq(t) = D(\chi \circ (\chi')^{-1})(q'(t))\,Dq'(t)$. Abstracting off paths, we
get $v = D(\chi \circ (\chi')^{-1})(x')\,v'$.

It is traditional to express this rule by saying that the basis
elements transform covariantly and the coefficients of a vector in
terms of a basis transform contravariantly; their product is invariant
under the transformation.
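
As a worked instance of equation (3.23), take $\chi$ to be the rectangular chart and $\chi'$ the polar chart of the plane, so that $x' = (r, \theta)$. The labels $b^x, b^y, b^r, b^\theta$ for the components are introduced here only for illustration. Writing the polar-to-rectangular Jacobian computed in section 2.1 as a matrix whose columns are the two up tuples of the down structure, equation (3.23) reads

$$\begin{pmatrix} b^x \\ b^y \end{pmatrix}
= \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix}
\begin{pmatrix} b^r \\ b^\theta \end{pmatrix}.$$

For the field with polar components $(b^r, b^\theta) = (0, 1)$ this gives

$$\begin{pmatrix} b^x \\ b^y \end{pmatrix}
= \begin{pmatrix} -r\sin\theta \\ r\cos\theta \end{pmatrix}
= \begin{pmatrix} -y \\ x \end{pmatrix},$$

so a unit rate of change of the angle corresponds to rectangular components $(-y, x)$, the familiar generator of rotations about the origin.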


3.3 Integral Curves 

A vector field gives a direction and rate for every point on a mani- 
fold. We can start at any point and go in the direction specified by 
the vector field, tracing out a parametric curve on the manifold. 
This curve is an integral curve of the vector field. 

More formally, let v be a vector field on the manifold M. An 
integral curve 7 ^ : R — » M of v is a parametric path on M satisfying 

= v ( f )(7m( i )) = ( v ( f )°7m)( i ) ( 3 - 24 ) 

7m (0) = m, (3.25) 

for arbitrary functions f on the manifold, with real values or struc- 
tured real values. The rate of change of a function along an inte- 
gral curve is the vector field applied to the function evaluated at 
the appropriate place along the curve. Often we will simply write 
7 , rather than 7 ^. Another useful variation is </>)(( m ) = 7m0)- 
We can recover the differential equations satisfied by a coordinate
representation of the integral curve by letting f = $\chi$, the
coordinate function, and letting $\sigma = \chi \circ \gamma$ be the coordinate path
corresponding to the curve $\gamma$. Then the derivative of the
coordinate path $\sigma$ is

$$\begin{aligned}
D\sigma(t) &= D(\chi \circ \gamma)(t) \\
&= (v(\chi) \circ \gamma)(t) \\
&= (v(\chi) \circ \chi^{-1} \circ \chi \circ \gamma)(t) \\
&= (b \circ \sigma)(t),
\end{aligned} \tag{3.26}$$

where $b = v(\chi) \circ \chi^{-1}$ is the coefficient function for the vector field
v for coordinates $\chi$ (see equation 3.7). So the coordinate path $\sigma$
satisfies the differential equations

$$D\sigma = b \circ \sigma. \tag{3.27}$$
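As a concrete instance of equation (3.27), take a vector field on the plane that generates rotations about the origin, expressed in rectangular coordinates. The particular field and the initial point are chosen here only for illustration:

```latex
% Illustrative worked example; the field v and the initial condition
% are assumed choices, not taken from the surrounding text.
\begin{aligned}
v &= x\,\frac{\partial}{\partial y} - y\,\frac{\partial}{\partial x},
  & b(x, y) &= (-y,\; x), \\
D\sigma^0 &= -\sigma^1, & D\sigma^1 &= \sigma^0, \\
\sigma(0) &= (1,\; 0) & \Longrightarrow\quad \sigma(t) &= (\cos t,\; \sin t).
\end{aligned}
```

The coordinate path is a circle, which is the integral curve of this field through the point with coordinates (1, 0).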

Differential equations for the integral curve can be expressed
only in a coordinate representation, because we cannot go from
one point on the manifold to another by addition of an increment.
However, we can do this by adding an increment of coordinates to
the coordinates of the point and then finding the corresponding
point on the manifold.

Iterating the process described by equation (3.24) we can compute
higher-order derivatives of functions along the integral curve:

$$\begin{aligned}
D(f \circ \gamma) &= v(f) \circ \gamma \\
D^2(f \circ \gamma) &= D(v(f) \circ \gamma) = v(v(f)) \circ \gamma \\
&\;\;\vdots \\
D^n(f \circ \gamma) &= v^n(f) \circ \gamma
\end{aligned} \tag{3.28}$$

Thus, the evolution of $f \circ \gamma$ can be written formally as a Taylor
series in the parameter:

$$\begin{aligned}
(f \circ \gamma)(t) &= (f \circ \gamma)(0) + t\, D(f \circ \gamma)(0) + \tfrac{1}{2} t^2 D^2(f \circ \gamma)(0) + \cdots \\
&= (e^{tD}(f \circ \gamma))(0) \\
&= (e^{tv} f)(\gamma(0)).
\end{aligned} \tag{3.29}$$

Using $\phi$ rather than $\gamma$

$$(f \circ \gamma_m^v)(t) = (f \circ \phi_t^v)(m), \tag{3.30}$$

so, when the series converges,

$$(e^{tv} f)(m) = (f \circ \phi_t^v)(m). \tag{3.31}$$

In particular, let f = $\chi$, then

$$\sigma(t) = (\chi \circ \gamma)(t) = (e^{tD}(\chi \circ \gamma))(0) = (e^{tv} \chi)(\gamma(0)), \tag{3.32}$$

a Taylor series representation of the solution to the differential
equation (3.27).

For example, a vector field circular that generates a rotation
about the origin is:⁸

(define circular (- (* x d/dy) (* y d/dx)))

8 In this expression d/dx and d/dy are vector fields that take directional
derivatives of manifold functions and evaluate them at manifold points; x and y are
manifold functions. define-coordinates was used to create these operators
and functions, see page 27.

Note that circular is an operator, a property inherited from d/dx and
d/dy.

We can exponentiate the circular vector field, to generate an 
evolution in a circle around the origin starting at (1, 0): 

(series:for-each print-expression
  (((exp (* 't circular)) (chart R2-rect))
   ((point R2-rect) (up 1 0)))
  6)
(up 1 0)
(up 0 t)
(up (* -1/2 (expt t 2)) 0)
(up 0 (* -1/6 (expt t 3)))
(up (* 1/24 (expt t 4)) 0)
(up 0 (* 1/120 (expt t 5)))

These are the first six terms of the series expansion of the coordi- 
nates of the position for parameter t. 

We can define an evolution operator $E_{\Delta t, v}$ using equation (3.31)

$$(E_{\Delta t, v} f)(m) = (e^{\Delta t\, v} f)(m) = (f \circ \phi^v_{\Delta t})(m). \tag{3.33}$$

We can approximate the evolution operator by summing the 
series up to a given order: 

(define ((((evolution order) delta-t v) f) m)
  (series:sum
    (((exp (* delta-t v)) f) m)
    order))

We can evolve circular from the initial point up to the parameter
$\Delta t$, and accumulate the terms of the series through sixth order
as follows:

((((evolution 6) 'delta-t circular) (chart R2-rect))
 ((point R2-rect) (up 1 0)))
(up (+ (* -1/720 (expt delta-t 6))
       (* 1/24 (expt delta-t 4))
       (* -1/2 (expt delta-t 2))
       1)
    (+ (* 1/120 (expt delta-t 5))
       (* -1/6 (expt delta-t 3))
       delta-t))

Note that these are just the series for $\cos \Delta t$ and $\sin \Delta t$, so the
coordinate tuple of the evolved point is $(\cos \Delta t, \sin \Delta t)$.

For functions whose series expansions have finite radius of
convergence, evolution can progress beyond the point at which the
Taylor series converges because evolution is well defined whenever
the integral curve is defined.
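One way to go further along an integral curve with a truncated series is to take several short steps, converting the evolved coordinates back into a manifold point after each step. The following is only a sketch under stated assumptions: it presumes a scmutils session in which circular and evolution are defined as above, and the helper names step and repeat-step are hypothetical, introduced here for illustration.

```scheme
;; Sketch only: step and repeat-step are hypothetical helpers, not
;; procedures from the text. Assumes scmutils is loaded and that
;; circular and evolution are defined as in the text above.

;; One step: evolve the coordinates of m by dt along v, then map the
;; resulting coordinate tuple back to a manifold point.
(define ((step order dt v coordsys) m)
  ((point coordsys)
   ((((evolution order) dt v) (chart coordsys)) m)))

;; Apply the one-step map n times, starting from the point m.
(define (repeat-step stepper n m)
  (if (= n 0)
      m
      (repeat-step stepper (- n 1) (stepper m))))

;; Ten steps of size 0.1 starting at (1, 0). The coordinates of the
;; result should be close to (cos 1, sin 1).
((chart R2-rect)
 (repeat-step (step 6 0.1 circular R2-rect)
              10
              ((point R2-rect) (up 1 0))))
```

Each step uses the series only near its own starting point, so the accuracy of the whole path is governed by the step size and the order rather than by the total parameter interval.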

Exercise 3.1: State Derivatives 

Newton’s equations for the motion of a particle in a plane, subject to 
a force that depends only on the position in the plane, are a system 
of second-order differential equations for the rectangular coordinates 
(X, Y) of the particle:

$$D^2 X(t) = A_x(X(t), Y(t)) \quad\text{and}\quad D^2 Y(t) = A_y(X(t), Y(t)),$$

where A is the acceleration of the particle.

These are equivalent to a system of first-order equations for the
coordinate path $\sigma = \chi \circ \gamma$, where $\chi = (t, x, y, v_x, v_y)$ is a coordinate system
on the manifold $\mathbf{R}^5$. Then our equations are:

$$\begin{aligned}
D(t \circ \gamma) &= 1 \\
D(x \circ \gamma) &= v_x \circ \gamma \\
D(y \circ \gamma) &= v_y \circ \gamma \\
D(v_x \circ \gamma) &= A_x(x \circ \gamma,\; y \circ \gamma) \\
D(v_y \circ \gamma) &= A_y(x \circ \gamma,\; y \circ \gamma)
\end{aligned}$$

Construct a vector field on $\mathbf{R}^5$ corresponding to this system of
differential equations. Derive the first few terms in the series solution of this
problem by exponentiation.


3.4 One-Form Fields 

A vector field that gives a velocity for each point on a topographic 
map of the surface of the Earth can be applied to a function, such 
as one that gives the height for each point on the topographic 
map, or a map that gives the temperature for each point. The 
vector field then provides the rate of change of the height or tem- 
perature as one moves in the way described by the vector field. 
Alternatively, we can think of a topographic map, which gives the 
height at each point, as measuring a velocity field at each point. 
For example, we may be interested in the velocity of the wind or 
the trajectories of migrating birds. The topographic map gives 
the rate of change of height at each point for each velocity vector
field. The rate of change of height can be thought of as the
number of equally-spaced (in height) contours that are pierced by
each velocity vector in the vector field.

Differential of a Function 

For example, consider the differential⁹ df of a manifold function
f, defined as follows. If df is applied to a vector field v we obtain 

df(v) = v(f), (3.34) 

which is a function of a manifold point. 

The differential of the height function on the topographic map is 
a function that gives the rate of change of height at each point for 
a velocity vector field. This gives the same answer as the velocity 
vector field applied to the height function. 
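The defining relation (3.34) can be checked symbolically. The sketch below assumes a scmutils session in which R2-rect is available as elsewhere in the text; the names height and walk are illustrative choices, not names from the text.

```scheme
;; Sketch, assuming a scmutils session with R2-rect available.
;; The names height and walk are hypothetical, chosen for illustration.
(define-coordinates (up x y) R2-rect)
(define R2-rect-point ((point R2-rect) (up 'x0 'y0)))

;; A literal height function and a literal velocity field.
(define height (literal-manifold-function 'h-rect R2-rect))
(define walk (literal-vector-field 'w R2-rect))

;; dh(v) - v(h), evaluated at a point. By equation (3.34) the two
;; terms are the same function, so the difference should reduce to 0.
((- ((d height) walk) (walk height)) R2-rect-point)
```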

The differential of a function is linear in the vector fields. The
differential is also a linear operator on functions: if $f_1$ and $f_2$ are
manifold functions, and if c is a real constant, then

$$d(f_1 + f_2) = df_1 + df_2$$

and

$$d(cf) = c\,df.$$

Note that c is not a manifold function. 

One-Form Fields

A one-form field is a generalization of this idea; it is something
that measures a vector field at each point.

One-form fields are linear functions of vector fields that produce
real-valued functions on the manifold. A one-form field is linear
in vector fields: if $\omega$ is a one-form field, v and w are vector
fields, and c is a manifold function, then

$$\omega(v + w) = \omega(v) + \omega(w) \tag{3.35}$$

and

$$\omega(cv) = c\,\omega(v). \tag{3.36}$$

Sums and scalar products of one-form fields on a manifold have
the following properties. If $\omega$ and $\theta$ are one-form fields, and
if f is a real-valued manifold function, then:

$$(\omega + \theta)(v) = \omega(v) + \theta(v), \tag{3.37}$$

$$(f\,\omega)(v) = f\,\omega(v). \tag{3.38}$$

9 The differential of a manifold function will turn out to be a special case of
the exterior derivative, which will be introduced later.

3.5 Coordinate-Basis One-Form Fields

Given a coordinate function $\chi$, we define the coordinate-basis
one-form fields $\tilde{X}^i$ by

$$\tilde{X}^i(v)(m) = v(\chi^i)(m) \tag{3.39}$$

or collectively

$$\tilde{X}(v)(m) = v(\chi)(m). \tag{3.40}$$

With this definition the coordinate-basis one-form fields are dual
to the coordinate-basis vector fields in the following sense (see
equation 3.15):¹⁰

$$\tilde{X}^i(X_j)(m) = X_j(\chi^i)(m) = \partial_j(\chi^i \circ \chi^{-1})(\chi(m)) = \delta^i_j. \tag{3.41}$$

The tuple of basis one-form fields $\tilde{X}(v)(m)$ is an up structure like
that of $\chi$.

The general one-form field $\omega$ is a linear combination of
coordinate-basis one-form fields:

$$\omega(v)(m) = a(\chi(m))\, \tilde{X}(v)(m) = \sum_i a_i(\chi(m))\, \tilde{X}^i(v)(m), \tag{3.42}$$

with coefficient-function tuple a(x), for $x = \chi(m)$. We can write
this more simply as

$$\omega(v) = (a \circ \chi)\, \tilde{X}(v), \tag{3.43}$$

because everything is evaluated at m.

10 The Kronecker delta $\delta^i_j$ is one if $i = j$ and zero otherwise.





The coefficient tuple can be recovered from the one-form field:¹¹

$$a_i(x) = \omega(X_i)(\chi^{-1}(x)). \tag{3.44}$$

This follows from the dual relationship (3.41). We can see this as
a program:¹²

(define omega
  (components->1form-field
    (down (literal-function 'a_0 R2->R)
          (literal-function 'a_1 R2->R))
    R2-rect))

((omega (down d/dx d/dy)) R2-rect-point)
(down (a_0 (up x0 y0)) (a_1 (up x0 y0)))

We provide a shortcut for this construction: 

(define omega (literal-1form-field 'a R2-rect))

A differential can be expanded in a coordinate basis:

$$df(v) = \sum_i c_i\, \tilde{X}^i(v). \tag{3.45}$$

The coefficients $c_i = df(X_i) = X_i(f) = \partial_i(f \circ \chi^{-1}) \circ \chi$ are the partial
derivatives of the coordinate representation of f in the coordinate
system of the basis:

(((d (literal-manifold-function 'f-rect R2-rect))
  (coordinate-system->vector-basis R2-rect))
 R2-rect-point)
(down (((partial 0) f-rect) (up x0 y0))
      (((partial 1) f-rect) (up x0 y0)))

However, if the coordinate system of the basis differs from the
coordinates of the representation of the function, the result is
complicated by the chain rule:

(((d (literal-manifold-function 'f-polar R2-polar))
  (coordinate-system->vector-basis R2-rect))
 ((point R2-polar) (up 'r 'theta)))
(down (- (* (((partial 0) f-polar) (up r theta)) (cos theta))
         (/ (* (((partial 1) f-polar) (up r theta))
               (sin theta))
            r))
      (+ (* (((partial 0) f-polar) (up r theta)) (sin theta))
         (/ (* (((partial 1) f-polar) (up r theta))
               (cos theta))
            r)))

11 The analogous recovery of coefficient tuples from vector fields is
equation (3.3): $b^i_{\chi,v} = v(\chi^i) \circ \chi^{-1}$.

12 The procedure components->1form-field is analogous to the procedure
components->vector-field introduced earlier.

The coordinate-basis one-form fields can be used to find the
coefficients of vector fields in the corresponding coordinate
vector-field basis:

$$\tilde{X}^i(v) = v(\chi^i) = b^i \circ \chi \tag{3.46}$$

or collectively,

$$\tilde{X}(v) = v(\chi) = b \circ \chi. \tag{3.47}$$

A coordinate-basis one-form field is often written $dx^i$. This
traditional notation for the coordinate-basis one-form fields is
justified by the relation:

$$dx^i = \tilde{X}^i = d(\chi^i). \tag{3.48}$$


The define-coordinates procedure also makes the basis one-form
fields with these traditional names inherited from the coordinates.

We can illustrate the duality of the coordinate-basis vector
fields and the coordinate-basis one-form fields:

(define-coordinates (up x y) R2-rect)

((dx d/dy) R2-rect-point)
0

((dx d/dx) R2-rect-point)
1


We can use the coordinate-basis one-form fields to extract the 
coefficients of circular on the rectangular vector basis:

((dx circular) R2-rect-point)
(* -1 y0)

((dy circular) R2-rect-point)
x0

But we can also find the coefficients on the polar vector basis:

((dr circular) R2-rect-point)
0

((dtheta circular) R2-rect-point)
1

So circular is the same as d/dtheta, as we can see by applying
them both to the general function f:

(define f (literal-manifold-function 'f-rect R2-rect))

(((- circular d/dtheta) f) R2-rect-point)
0

Not All One-Form Fields Are Differentials 

Although all one-form fields can be constructed as linear combi- 
nations of basis one-form fields, not all one-form fields are differ- 
entials of functions. 

The coefficients of a differential are (see equation 3.45):

$$c_i = X_i(f) = df(X_i) \tag{3.49}$$

and partial derivatives of functions commute

$$X_i(X_j(f)) = X_j(X_i(f)). \tag{3.50}$$

As a consequence, the coefficients of a differential are constrained

$$X_i(c_j) = X_j(c_i), \tag{3.51}$$

but a one-form field can be constructed with arbitrary coefficient
functions. For example:

$$x\,dx + x\,dy \tag{3.52}$$

is not a differential of any function: both coefficients are x, so the
derivative of the dy coefficient in the x direction is 1 while the
derivative of the dx coefficient in the y direction is 0, which
violates the constraint (3.51). This is why we started with
the basis one-form fields and built the general one-form fields in
terms of them.

Coordinate Transformations

Consider a coordinate change from the chart $\chi$ to the chart $\chi'$:

$$\begin{aligned}
\tilde{X}(v) &= v(\chi) \\
&= v(\chi \circ (\chi')^{-1} \circ \chi') \\
&= (D(\chi \circ (\chi')^{-1}) \circ \chi')\, v(\chi') \\
&= (D(\chi \circ (\chi')^{-1}) \circ \chi')\, \tilde{X}'(v),
\end{aligned} \tag{3.53}$$

where the third line follows from the chain rule for vector fields.
One-form fields are independent of coordinates. So,

$$\omega(v) = (a \circ \chi)\, \tilde{X}(v) = (a' \circ \chi')\, \tilde{X}'(v). \tag{3.54}$$

Eqs. (3.54) and (3.53) require that the coefficients transform under
coordinate transformations as follows:

$$a(\chi(m))\, D(\chi \circ (\chi')^{-1})(\chi'(m)) = a'(\chi'(m)), \tag{3.55}$$

or

$$a(\chi(m)) = a'(\chi'(m))\, \left(D(\chi \circ (\chi')^{-1})(\chi'(m))\right)^{-1}. \tag{3.56}$$

The coefficient tuple a(x) is a down structure compatible for
contraction with b(x). Let v be the vector with coefficient tuple
b(x), and $\omega$ be the one-form with coefficient tuple a(x). Then, by
equation (3.43),

$$\omega(v) = (a \circ \chi)\, (b \circ \chi). \tag{3.57}$$

As a program: 

(define omega (literal-1form-field 'a R2-rect))

(define v (literal-vector-field 'b R2-rect))

((omega v) R2-rect-point)
(+ (* (b^0 (up x0 y0)) (a_0 (up x0 y0)))
   (* (b^1 (up x0 y0)) (a_1 (up x0 y0))))

Comparing equation (3.56) with equation (3.23) we see that 
one-form components and vector components transform oppo- 
sitely, so that 


$$a(x)\, b(x) = a'(x')\, b'(x'), \tag{3.58}$$

as expected because $\omega(v)(m)$ is independent of coordinates.

Exercise 3.2: Verification

Verify that the coefficients of a one-form field transform as described in 
equation (3.56). You should use equation (3.44) in your derivation. 

Exercise 3.3: Hill Climbing 

The topography of a region on the Earth can be specified by a manifold 
function h that gives the altitude at each point on the manifold. Let 
v be a vector field on the manifold, perhaps specifying a direction and 
rate of walking at every point on the manifold. 

a. Form an expression that gives the power that must be expended to 
follow the vector field at each point. 

b. Write this as a computational expression.


4 Basis Fields


A vector field may be written as a linear combination of basis
vector fields. If n is the dimension, then any set of n linearly
independent vector fields may be used as a basis. The coordinate
basis X is an example of a basis.¹ We will see later that not every
basis is a coordinate basis: in order to be a coordinate basis,
there must be a coordinate system such that each basis element is
the directional derivative operator in a corresponding coordinate
direction.

Let e be a tuple of basis vector fields, such as the coordinate
basis X. The general vector field v applied to an arbitrary manifold
function f can be expressed as a linear combination

$$v(f)(m) = e(f)(m)\, \mathsf{b}(m) = \sum_i e_i(f)(m)\, \mathsf{b}^i(m), \tag{4.1}$$

where $\mathsf{b}$ is a tuple-valued coefficient function on the manifold.
When expressed in a coordinate basis, the coefficients that specify
the direction of the vector are naturally expressed as functions
$b^i$ of the coordinates of the manifold point. Here, the coefficient
function $\mathsf{b}$ is more naturally expressed as a tuple-valued function
on the manifold. If $b$ is the coefficient function expressed as a
function of coordinates, then $\mathsf{b} = b \circ \chi$ is the coefficient function
as a function on the manifold.

The coordinate-basis forms have a simple definition in terms of
the coordinate-basis vectors and the coordinates (equation 3.40).
With this choice, the dual property, equation (3.41), holds without
further fuss. More generally, we can define a basis of one-forms $\tilde{e}$
that is dual to e in that the property

$$\tilde{e}^i(e_j)(m) = \delta^i_j \tag{4.2}$$

is satisfied, analogous to property (3.41). Figure 4.1 illustrates
the duality of basis fields.

1 We cannot say if the basis vectors are orthogonal or normalized until we
introduce a metric.






Figure 4.1 Let arrows $e_0$ and $e_1$ depict the vectors of a basis vector
field at a particular point. Then the foliations shown by the parallel
lines depict the dual basis one-form fields at that point. The dotted
lines represent the field $\tilde{e}^0$ and the dashed lines represent the field $\tilde{e}^1$.
The spacings of the lines are 1/3 unit. That the vectors pierce three
of the lines representing their duals and do not pierce any of the lines
representing the other basis elements is one way to see the relationship
$\tilde{e}^i(e_j)(m) = \delta^i_j$.

To solve for the dual basis $\tilde{e}$ given the basis e, we express the
basis vectors e in terms of a coordinate basis²

$$e_j(f) = \sum_k X_k(f)\, c^k_j \tag{4.3}$$

and the dual one-forms $\tilde{e}$ in terms of the dual coordinate one-forms

$$\tilde{e}^i(v) = \sum_l d^i_l\, \tilde{X}^l(v), \tag{4.4}$$

then

$$\begin{aligned}
\tilde{e}^i(e_j) &= \sum_l d^i_l\, \tilde{X}^l(e_j) \\
&= \sum_l d^i_l\, e_j(\chi^l) \\
&= \sum_l d^i_l \sum_k X_k(\chi^l)\, c^k_j \\
&= \sum_{kl} d^i_l\, \delta^l_k\, c^k_j \\
&= \sum_k d^i_k\, c^k_j.
\end{aligned} \tag{4.5}$$

Applying this at m we get

$$\tilde{e}^i(e_j)(m) = \sum_k d^i_k(m)\, c^k_j(m). \tag{4.6}$$

So the d coefficients can be determined from the c coefficients
(essentially by matrix inversion).

2 We write the vector components on the right and the tuple of basis vectors
on the left because if we think of the basis vectors as organized as a row and
the components as organized as a column then the formula is just a matrix
multiplication.
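To make the matrix inversion explicit, combine the duality requirement (4.2) with equation (4.6). The orthonormal polar frame used as the example below is an assumed illustration, not a basis taken from the text:

```latex
% Duality (4.2) applied to (4.6): d(m) is the matrix inverse of c(m).
\sum_k d^i_k(m)\, c^k_j(m) = \delta^i_j
\quad\Longrightarrow\quad d(m) = c(m)^{-1}.

% Assumed example: coordinate basis X = (\partial/\partial r, \partial/\partial\theta),
% with e_0 = \partial/\partial r and e_1 = (1/r)\,\partial/\partial\theta.
c = \begin{pmatrix} 1 & 0 \\ 0 & 1/r \end{pmatrix},
\qquad
d = c^{-1} = \begin{pmatrix} 1 & 0 \\ 0 & r \end{pmatrix},
\qquad
\tilde{e}^0 = dr, \quad \tilde{e}^1 = r\, d\theta.
```

At r = 0 the matrix c is not defined, so this frame and its dual exist only away from the origin.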

A set of vector fields $\{e_i\}$ may be linearly independent in the
sense that a weighted sum of them may not be identically zero over
a region, yet it may not be a basis in that region. The problem is
that there may be some places in the region where the vectors are
not independent. For example, two of the vectors may be parallel
at a point but not parallel elsewhere in the region. At such a point
m the determinant of the matrix c(m) is zero. So at these points
we cannot define the dual basis forms.³

The dual form fields can be used to determine the coefficients $\mathsf{b}$
of a vector field v relative to a basis e, by applying the dual basis
form fields $\tilde{e}$ to the vector field. Let

$$v(f) = \sum_i e_i(f)\, \mathsf{b}^i. \tag{4.7}$$

Then

$$\tilde{e}^j(v) = \mathsf{b}^j. \tag{4.8}$$

3 This is why the set of vector fields and the set of one-form fields are modules
rather than vector spaces.

Define two general vector fields:

(define e0
  (+ (* (literal-manifold-function 'e0x R2-rect) d/dx)
     (* (literal-manifold-function 'e0y R2-rect) d/dy)))

(define e1
  (+ (* (literal-manifold-function 'e1x R2-rect) d/dx)
     (* (literal-manifold-function 'e1y R2-rect) d/dy)))

We use these as a vector basis and compute the dual:

(define e-vector-basis (down e0 e1))

(define e-dual-basis
  (vector-basis->dual e-vector-basis R2-polar))

The procedure vector-basis->dual requires an auxiliary coordinate
system (here R2-polar) to get the $c^k_j$ coefficient functions
from which we compute the $d^i_l$ coefficient functions. However,
the final result is independent of this coordinate system. Then
we can verify that the bases e and $\tilde{e}$ satisfy the dual relationship
(equation 3.41) by applying the dual basis to the vector basis:

((e-dual-basis e-vector-basis) R2-rect-point)
(up (down 1 0) (down 0 1))

Note that the dual basis was computed relative to the polar coor- 
dinate system: the resulting objects are independent of the coor- 
dinates in which they were expressed! 

Or we can make a general vector field with this basis and then 
pick out the coefficients by applying the dual basis: 

(define v
  (* (up (literal-manifold-function 'b^0 R2-rect)
         (literal-manifold-function 'b^1 R2-rect))
     e-vector-basis))

((e-dual-basis v) R2-rect-point)
(up (b^0 (up x0 y0)) (b^1 (up x0 y0)))


4.1 Change of Basis 

Suppose that we have a vector field v expressed in terms of one
basis e and we want to reexpress it in terms of another basis e'.
We have

$$v(f) = \sum_i e_i(f)\, \mathsf{b}^i = \sum_j e'_j(f)\, \mathsf{b}'^j. \tag{4.9}$$

The coefficients $\mathsf{b}'$ can be obtained from v by applying the dual
basis

$$\mathsf{b}'^j = \tilde{e}'^j(v) = \sum_i \tilde{e}'^j(e_i)\, \mathsf{b}^i. \tag{4.10}$$

Let

$$J^j_i = \tilde{e}'^j(e_i), \tag{4.11}$$

then

$$\mathsf{b}'^j = \sum_i J^j_i\, \mathsf{b}^i, \tag{4.12}$$

and

$$e_i(f) = \sum_j e'_j(f)\, J^j_i. \tag{4.13}$$

The Jacobian J is a structure of manifold functions. Using tuple
arithmetic, we can write

$$\mathsf{b}' = J\, \mathsf{b} \tag{4.14}$$

and

$$e(f) = e'(f)\, J. \tag{4.15}$$


We can write 

(define (Jacobian to-basis from-basis)
  (s:map/r (basis->1form-basis to-basis)
           (basis->vector-basis from-basis)))

These are the rectangular components of a vector field:

(define b-rect
  ((coordinate-system->1form-basis R2-rect)
   (literal-vector-field 'b R2-rect)))


The polar components are:

(define b-polar
  (* (Jacobian (coordinate-system->basis R2-polar)
               (coordinate-system->basis R2-rect))
     b-rect))

(b-polar ((point R2-rect) (up 'x0 'y0)))
(up
 (/ (+ (* x0 (b^0 (up x0 y0))) (* y0 (b^1 (up x0 y0))))
    (sqrt (+ (expt x0 2) (expt y0 2))))
 (/ (+ (* x0 (b^1 (up x0 y0))) (* -1 y0 (b^0 (up x0 y0))))
    (+ (expt x0 2) (expt y0 2))))

We can also get the polar components directly:

(((coordinate-system->1form-basis R2-polar)
  (literal-vector-field 'b R2-rect))
 ((point R2-rect) (up 'x0 'y0)))
(up
 (/ (+ (* x0 (b^0 (up x0 y0))) (* y0 (b^1 (up x0 y0))))
    (sqrt (+ (expt x0 2) (expt y0 2))))
 (/ (+ (* x0 (b^1 (up x0 y0))) (* -1 y0 (b^0 (up x0 y0))))
    (+ (expt x0 2) (expt y0 2))))

We see that they are the same. 

If K is the Jacobian that relates the basis vectors in the other 
direction 

e'(f) = e(f)K (4.16) 

then 

KJ = I = JK (4.17) 

where I is a manifold function that returns the multiplicative iden- 
tity. 

The dual basis transforms oppositely. Let

$$\omega = \sum_i \mathsf{a}_i\, \tilde{e}^i = \sum_i \mathsf{a}'_i\, \tilde{e}'^i. \tag{4.18}$$

The coefficients are⁴

$$\mathsf{a}_i = \omega(e_i) = \sum_j \mathsf{a}'_j\, \tilde{e}'^j(e_i) = \sum_j \mathsf{a}'_j\, J^j_i, \tag{4.19}$$

or, in tuple arithmetic,

$$\mathsf{a} = \mathsf{a}'\, J. \tag{4.20}$$

Because of equation (4.18) we can deduce

$$\tilde{e} = K\, \tilde{e}'. \tag{4.21}$$

4.2 Rotation Basis

One interesting basis for rotations in 3-dimensional space is not a
coordinate basis.

Rotations are the actions of the special orthogonal group SO(3),
which is a 3-dimensional manifold. The elements of this group
may be represented by the set of 3 × 3 orthogonal matrices with
determinant +1.

We can use a coordinate patch on this manifold with Euler angle
coordinates: each element has three coordinates, $\theta$, $\phi$, $\psi$. A
manifold point may be represented by a rotation matrix. The rotation
matrix for Euler angles is a product of three simple rotations:
$M(\theta, \phi, \psi) = R_z(\phi)\, R_x(\theta)\, R_z(\psi)$, where $R_x$ and $R_z$ are functions
that take an angle and produce the matrices representing
rotations about the x and z axes, respectively. We can visualize $\theta$ as
the colatitude of the pole from the z-axis, $\phi$ as the longitude, and
$\psi$ as the rotation around the pole.

Given a rotation specified by Euler angles, how do we change
the Euler angle to correspond to an incremental rotation of size
$\epsilon$ about the x-axis? The direction (a, b, c) is constrained by the
equation

$$R_x(\epsilon)\, M(\theta, \phi, \psi) = M(\theta + a\epsilon,\; \phi + b\epsilon,\; \psi + c\epsilon). \tag{4.22}$$


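4 We see from equations (4.15) and (4.16) that J and K are inverses. We can
obtain their coefficients by: $J^j_i = \tilde{e}'^j(e_i)$ and $K^j_i = \tilde{e}^j(e'_i)$.

Linear equations for (a, b, c) can be found by taking the derivative
of this equation with respect to $\epsilon$. We find

$$0 = c \cos\theta + b, \tag{4.23}$$

$$0 = a \sin\phi - c \cos\phi \sin\theta, \tag{4.24}$$

$$1 = c \sin\phi \sin\theta + a \cos\phi, \tag{4.25}$$

with the solution

$$a = \cos\phi, \tag{4.26}$$

$$b = -\frac{\sin\phi \cos\theta}{\sin\theta}, \tag{4.27}$$

$$c = \frac{\sin\phi}{\sin\theta}. \tag{4.28}$$

Therefore, we can write the basis vector field that takes directional
derivatives in the direction of incremental x rotations as

$$\begin{aligned}
e_x &= a\, \frac{\partial}{\partial\theta} + b\, \frac{\partial}{\partial\phi} + c\, \frac{\partial}{\partial\psi} \\
&= \cos\phi\, \frac{\partial}{\partial\theta} - \frac{\sin\phi \cos\theta}{\sin\theta}\, \frac{\partial}{\partial\phi} + \frac{\sin\phi}{\sin\theta}\, \frac{\partial}{\partial\psi}.
\end{aligned} \tag{4.29}$$

Similarly, vector fields for the incremental y and z rotations are

$$e_y = \frac{\cos\phi \cos\theta}{\sin\theta}\, \frac{\partial}{\partial\phi} + \sin\phi\, \frac{\partial}{\partial\theta} - \frac{\cos\phi}{\sin\theta}\, \frac{\partial}{\partial\psi}, \tag{4.30}$$

$$e_z = \frac{\partial}{\partial\phi}. \tag{4.31}$$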

4.3 Commutators 

The commutator of two vector fields is defined as 

[v, w](f) = v(w(f)) — w(v(f)). (4.32) 

In the special case that the two vector fields are coordinate basis
fields, the commutator is zero:

$$\begin{aligned}
[X_i, X_j](f) &= X_i(X_j(f)) - X_j(X_i(f)) \\
&= \partial_i \partial_j (f \circ \chi^{-1}) \circ \chi - \partial_j \partial_i (f \circ \chi^{-1}) \circ \chi \\
&= 0,
\end{aligned} \tag{4.33}$$

because the individual partial derivatives commute. The vanishing
commutator is telling us that we get to the same manifold point by
integrating from a point along first one basis vector field and then
another as from integrating in the other order. If the commutator
is zero we can use the integral curves of the basis vector fields to
form a coordinate mesh.
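Equation (4.33) can be checked on the rectangular and polar coordinate bases of the plane. This is a sketch assuming a scmutils session in which R2-rect and R2-polar are available as elsewhere in the text:

```scheme
;; Sketch, assuming a scmutils session with R2-rect and R2-polar
;; available as in the text.
(define-coordinates (up x y) R2-rect)
(define-coordinates (up r theta) R2-polar)
(define R2-rect-point ((point R2-rect) (up 'x0 'y0)))
(define f (literal-manifold-function 'f-rect R2-rect))

;; Commutator of the rectangular coordinate-basis vector fields.
(((commutator d/dx d/dy) f) R2-rect-point)

;; Commutator of the polar coordinate-basis vector fields.
(((commutator d/dr d/dtheta) f) R2-rect-point)
```

By equation (4.33) both expressions should reduce to 0, since each pair is the coordinate basis of a single coordinate system.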

More generally, the commutator of two vector fields is a vector
field. Let v be a vector field with coefficient function $\mathsf{c} = c \circ \chi$,
and u be a vector field with coefficient function $\mathsf{b} = b \circ \chi$, both
with respect to the coordinate basis X. Then

$$\begin{aligned}
[u, v](f) &= u(v(f)) - v(u(f)) \\
&= \sum_i X_i\Big(\sum_j X_j(f)\, \mathsf{c}^j\Big)\, \mathsf{b}^i - \sum_j X_j\Big(\sum_i X_i(f)\, \mathsf{b}^i\Big)\, \mathsf{c}^j \\
&= \sum_{ij} [X_i, X_j](f)\, \mathsf{c}^j \mathsf{b}^i + \sum_i X_i(f) \sum_j \big(X_j(\mathsf{c}^i)\, \mathsf{b}^j - X_j(\mathsf{b}^i)\, \mathsf{c}^j\big) \\
&= \sum_i X_i(f)\, \mathsf{a}^i,
\end{aligned} \tag{4.34}$$

where the coefficient function $\mathsf{a}$ of the commutator vector field is

$$\begin{aligned}
\mathsf{a}^i &= \sum_j \big(X_j(\mathsf{c}^i)\, \mathsf{b}^j - X_j(\mathsf{b}^i)\, \mathsf{c}^j\big) \\
&= u(\mathsf{c}^i) - v(\mathsf{b}^i).
\end{aligned} \tag{4.35}$$

We used the fact, shown above, that the commutator of two
coordinate basis fields is zero.

We can check this formula for the commutator for the general
vector fields e0 and e1 in polar coordinates:

(let* ((polar-basis (coordinate-system->basis R2-polar))
       (polar-vector-basis (basis->vector-basis polar-basis))
       (polar-dual-basis (basis->1form-basis polar-basis))
       (f (literal-manifold-function 'f-rect R2-rect)))
  ((- ((commutator e0 e1) f)
      (* (- (e0 (polar-dual-basis e1))
            (e1 (polar-dual-basis e0)))
         (polar-vector-basis f)))
   R2-rect-point))
0


Let e be a tuple of basis vector fields. The commutator of two
basis fields can be expressed in terms of the basis vector fields:

$$[e_i, e_j](f) = \sum_k d^k_{ij}\, e_k(f), \tag{4.36}$$

where $d^k_{ij}$ are functions of m, called the structure constants for the
basis vector fields. The coefficients are

$$d^k_{ij} = \tilde{e}^k([e_i, e_j]). \tag{4.37}$$

The commutator [u, v] with respect to a non-coordinate basis $e_i$
is

$$[u, v](f) = \sum_k e_k(f) \Big( u(\mathsf{c}^k) - v(\mathsf{b}^k) + \sum_{ij} \mathsf{c}^j \mathsf{b}^i d^k_{ij} \Big). \tag{4.38}$$

Define the vector fields Jx, Jy, and Jz that generate rotations
about the three rectangular axes in three dimensions:⁵

(define Jz (- (* x d/dy) (* y d/dx)))
(define Jx (- (* y d/dz) (* z d/dy)))
(define Jy (- (* z d/dx) (* x d/dz)))

(((+ (commutator Jx Jy) Jz) g) R3-rect-point)
0

(((+ (commutator Jy Jz) Jx) g) R3-rect-point)
0

(((+ (commutator Jz Jx) Jy) g) R3-rect-point)
0

We see that

$$\begin{aligned}
[J_x, J_y] &= -J_z \\
[J_y, J_z] &= -J_x \\
[J_z, J_x] &= -J_y.
\end{aligned} \tag{4.39}$$

5 Using

(define R3-rect (coordinate-system-at 'rectangular 'origin R3))
(define-coordinates (up x y z) R3-rect)
(define R3-rect-point ((point R3-rect) (up 'x0 'y0 'z0)))
(define g (literal-manifold-function 'g-rect R3-rect))

We can also compute the commutators for the basis vector fields
$e_x$, $e_y$, and $e_z$ in the SO(3) manifold (see equations 4.29-4.31) that
correspond to rotations about the x, y, and z axes, respectively:⁶

(((+ (commutator e_x e_y) e_z) f) SO3-point)
0

(((+ (commutator e_y e_z) e_x) f) SO3-point)
0

(((+ (commutator e_z e_x) e_y) f) SO3-point)
0

You can tell if a set of basis vector fields is a coordinate basis by 
calculating the commutators. If they are nonzero, then the basis 
is not a coordinate basis. If they are zero then the basis vector 
fields can be integrated to give the coordinate system. 
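A standard example of a basis that fails this test is the orthonormal polar frame on the plane, away from the origin. The frame below is an illustrative choice, not one taken from the text:

```latex
% Assumed example: orthonormal polar frame, defined for r > 0.
\begin{aligned}
e_r &= \frac{\partial}{\partial r},
\qquad
e_\theta = \frac{1}{r}\,\frac{\partial}{\partial \theta}, \\
[e_r, e_\theta](f)
 &= \frac{\partial}{\partial r}\!\left(\frac{1}{r}\,\frac{\partial f}{\partial \theta}\right)
  - \frac{1}{r}\,\frac{\partial}{\partial \theta}\!\left(\frac{\partial f}{\partial r}\right) \\
 &= -\frac{1}{r^2}\,\frac{\partial f}{\partial \theta}
  = -\frac{1}{r}\, e_\theta(f).
\end{aligned}
```

The commutator does not vanish, so no coordinate system has $(e_r, e_\theta)$ as its coordinate basis.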

Recall equation (3.31)

$$(e^{tv} f)(m) = (f \circ \phi^v_t)(m). \tag{4.40}$$

Iterating this equation, we find

$$(e^{sw} e^{tv} f)(m) = (f \circ \phi^v_t \circ \phi^w_s)(m). \tag{4.41}$$

Notice that the evolution under w occurs before the evolution
under v.

6 Using

(define Euler-angles (coordinate-system-at 'Euler 'Euler-patch SO3))
(define Euler-angles-chi-inverse (point Euler-angles))
(define-coordinates (up theta phi psi) Euler-angles)
(define SO3-point ((point Euler-angles) (up 'theta 'phi 'psi)))
(define f (literal-manifold-function 'f-Euler Euler-angles))

To illustrate the meaning of the commutator, consider the
evolution around a small loop with sides made from the integral
curves of two vector fields v and w. We will first follow v, then w,
then -v, and then -w:

$$(e^{\epsilon v} e^{\epsilon w} e^{-\epsilon v} e^{-\epsilon w} f)(m). \tag{4.42}$$

To second order in $\epsilon$ the result is⁷

$$(e^{\epsilon^2 [v, w]} f)(m). \tag{4.43}$$

This result is illustrated in figure 4.2.

Take a point 0 in M as the origin. Then, presuming $[e_i, e_j] = 0$,
the coordinates x of the point m in the coordinate system
corresponding to the e basis satisfy⁸

$$m = \phi^{xe}_1(0) = \chi^{-1}(x), \tag{4.44}$$

where $\chi$ is the coordinate function being defined. Because the
elements of e commute, we can translate separately along the
integral curves in any order and reach the same point; the terms
in the exponential can be factored into separate exponentials if
needed.

7 For non-commuting operators A and B,

$$\begin{aligned}
e^A e^B e^{-A} e^{-B}
&= \left(1 + A + \frac{A^2}{2} + \cdots\right) \left(1 + B + \frac{B^2}{2} + \cdots\right) \\
&\quad \times \left(1 - A + \frac{A^2}{2} + \cdots\right) \left(1 - B + \frac{B^2}{2} + \cdots\right) \\
&= 1 + [A, B] + \cdots,
\end{aligned}$$

to second order in A and B. All higher-order terms can be written in terms
of higher-order commutators of A and B. An example of a higher-order
commutator is [A, [A, B]].

8 Here x is an up-tuple structure of components, and e is a down-tuple structure
of basis vectors. The product of the two contracts to make a scaled vector,
along which we translate by one unit.

Figure 4.2 The commutator of two vector fields computes the residual
of a small loop following their integral curves.

Exercise 4.1: Alternate Angles

Note that the Euler angles are singular at $\theta = 0$ (where $\phi$ and $\psi$ become
degenerate), so the representations of $e_x$, $e_y$, and $e_z$ (defined in
equations 4.29-4.31) have problems there. An alternate coordinate system
avoids this problem, while introducing a similar problem elsewhere in
the manifold.

Consider the “alternate angles” $(\theta_a, \phi_a, \psi_a)$ which define a rotation
matrix via $M(\theta_a, \phi_a, \psi_a) = R_z(\phi_a)\, R_x(\theta_a)\, R_y(\psi_a)$.

a. Where does the singularity appear in these alternate coordinates?
Do you think you could define a coordinate system for rotations that
has no singularities?

b. What do the $e_x$, $e_y$, and $e_z$ basis vector fields look like in this
coordinate system?

Exercise 4.2: General Commutators 

Verify equation (4.38). 

Exercise 4.3: SO (3) Basis and Angular Momentum Basis 

How are $J_x$, $J_y$, and $J_z$ related to $e_x$, $e_y$, and $e_z$ in equations (4.29-4.31)?


5 Integration

We know how to integrate real-valued functions of a real variable.
We want to extend this idea to manifolds, in such a way that the 
integral is independent of the coordinate system used to compute 
it. 

The integral of a real-valued function of a real variable is the 
limit of a sum of products of the values of the function on subinter- 
vals and the lengths of the increments of the independent variable 
in those subintervals: 

$$\int_a^b f = \int_a^b f(x)\, dx = \lim_{\Delta x_i \to 0} \sum_i f(x_i)\, \Delta x_i. \tag{5.1}$$


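If we change variables (x = g(y)), then the form of the integral
changes:

$$\begin{aligned}
\int_a^b f = \int_a^b f(x)\, dx
&= \int_{g^{-1}(a)}^{g^{-1}(b)} f(g(y))\, Dg(y)\, dy \\
&= \int_{g^{-1}(a)}^{g^{-1}(b)} (f \circ g)\, Dg.
\end{aligned} \tag{5.2}$$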


We can make a coordinate-independent notion of integration in
the following way. An interval of the real line is a 1-dimensional
manifold with boundary. We can assign a coordinate chart $\chi$ to
this manifold. Let $x = \chi(m)$. The coordinate basis is associated
with a coordinate-basis vector field, here $\partial/\partial x$. Let $\omega$ be a
one-form on this manifold. The application of $\omega$ to $\partial/\partial x$ is a
real-valued function on the manifold. If we compose this with the
inverse chart, we get a real-valued function of a real variable. We
can then write the usual integral of this function


I = 


u(d/dx)o X 


-l 


(5.3) 
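
It turns out that the value of this integral is independent of the
coordinate chart used in its definition. Consider a different coordinate
chart $x' = \chi'(\mathsf{m})$, with associated basis vector field $\partial/\partial x'$.
Let $g = \chi' \circ \chi^{-1}$. We have

$$\begin{aligned}
\int_{a'}^{b'} \omega(\partial/\partial x') \circ \chi'^{-1}
&= \int_{a'}^{b'} \omega\bigl(\partial/\partial x\,(D(\chi \circ \chi'^{-1}) \circ \chi')\bigr) \circ \chi'^{-1} \\
&= \int_{a'}^{b'} \bigl(\omega(\partial/\partial x)\,(D(\chi \circ \chi'^{-1}) \circ \chi')\bigr) \circ \chi'^{-1} \\
&= \int_{a'}^{b'} \bigl(\omega(\partial/\partial x) \circ \chi'^{-1}\bigr)\, D(\chi \circ \chi'^{-1}) \\
&= \int_a^b \Bigl(\bigl((\omega(\partial/\partial x) \circ \chi'^{-1})\, D(\chi \circ \chi'^{-1})\bigr) \circ g\Bigr)\, Dg \\
&= \int_a^b \omega(\partial/\partial x) \circ \chi^{-1},
\end{aligned} \tag{5.4}$$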


where we have used the rule for coordinate transformations of
basis vectors (equation 3.19), linearity of forms in the first two
lines, and the rule for change-of-variables under an integral in the
last line. 1

Because the integral is independent of the coordinate chart, we
can write simply

$$I = \int_{\mathsf{M}} \omega, \tag{5.5}$$

where M is the 1-dimensional manifold with boundary corresponding
to the interval.

We are exploiting the fact that coordinate basis vectors in different
coordinate systems are related by a Jacobian (see equation 3.19),
which cancels the Jacobian that appears in the change-of-variables
formula for integration (see equation 5.2).


1 Note $(D(\chi \circ \chi'^{-1}) \circ (\chi' \circ \chi^{-1}))\, D(\chi' \circ \chi^{-1}) = 1$. With $g = \chi' \circ \chi^{-1}$ this is
$(D(g^{-1}) \circ g)\,(Dg) = 1$.

5.1 Higher Dimensions


We have seen that we can integrate one-forms on 1-dimensional 
manifolds. We need higher-rank forms that we can integrate on 
higher-dimensional manifolds in a coordinate-independent man- 
ner. 

Consider the integral of a real-valued function, $f : \mathbf{R}^n \to \mathbf{R}$, over
a region $U$ in $\mathbf{R}^n$. Under a coordinate transformation $g : \mathbf{R}^n \to \mathbf{R}^n$,
we have 2

$$\int_U f = \int_{g^{-1}(U)} (f \circ g)\, \det(Dg). \tag{5.6}$$
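
To make the role of the determinant concrete, consider a familiar case. The choice of polar coordinates and of the unit disk below is an illustrative assumption, not from the text: let $g(r, \theta) = (r\cos\theta, r\sin\theta)$, let $U$ be the unit disk, and take $f = 1$. Ignoring the measure-zero set where $g$ fails to be invertible, $g^{-1}(U)$ is the rectangle $0 \le r \le 1$, $0 \le \theta < 2\pi$.

```latex
Dg(r,\theta) =
\begin{pmatrix}
\cos\theta & -r\sin\theta \\
\sin\theta & \phantom{-}r\cos\theta
\end{pmatrix},
\qquad
\det(Dg) = r,
\qquad
\int_U 1 = \int_0^{1}\!\int_0^{2\pi} r \, d\theta\, dr = \pi .
```

The factor $\det(Dg) = r$ is exactly what turns the area of the coordinate rectangle ($2\pi$) into the area of the disk ($\pi$).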

A rank n form field takes n vector field arguments and produces 
a real-valued manifold function: $\omega(\mathsf{v}, \mathsf{w}, \ldots, \mathsf{u})(\mathsf{m})$. By analogy
with the 1-dimensional case, higher-rank forms are linear in each 
argument. Higher-rank forms must also be antisymmetric under 
interchange of any two arguments in order to make a coordinate- 
free definition of integration analogous to equation (5.3). 

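Consider an integral in the coordinate system $\chi$:

$$I = \int_{\chi(\mathsf{U})} \omega(\mathsf{X}_0, \mathsf{X}_1, \ldots) \circ \chi^{-1}. \tag{5.7}$$

Under coordinate transformations $g = \chi \circ \chi'^{-1}$, the integral becomes

$$\int_{\chi'(\mathsf{U})} \omega(\mathsf{X}_0, \mathsf{X}_1, \ldots) \circ \chi'^{-1}\, \det(Dg). \tag{5.8}$$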

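Using the change-of-basis formula, equation (3.19):

$$\mathsf{X}(\mathsf{f}) = \mathsf{X}'(\mathsf{f})\,(D(\chi' \circ \chi^{-1})) \circ \chi = \mathsf{X}'(\mathsf{f})\,(D(g^{-1})) \circ \chi. \tag{5.9}$$

If we let $M = (D(g^{-1})) \circ \chi$ then

$$\begin{aligned}
&(\omega(\mathsf{X}_0, \mathsf{X}_1, \ldots) \circ \chi'^{-1})\, \det(Dg) \\
&\quad= (\omega(\mathsf{X}' M_0, \mathsf{X}' M_1, \ldots) \circ \chi'^{-1})\, \det(Dg) \\
&\quad= (\omega(\mathsf{X}'_0, \mathsf{X}'_1, \ldots) \circ \chi'^{-1})\, \alpha(M_0, M_1, \ldots)\, \det(Dg),
\end{aligned} \tag{5.10}$$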


2 The determinant is the unique function of the rows of its argument that i) is 
linear in each row, ii) changes sign under any interchange of rows, and iii) is 
one when applied to the identity multiplier. 

using the multilinearity of $\omega$, where $M_i$ is the $i$th column of $M$.
The function $\alpha$ is multilinear in the columns of $M$. To make a
coordinate-independent integration we want the expression (5.10)
to be the same as the integrand in

$$I' = \int_{\chi'(\mathsf{U})} \omega(\mathsf{X}'_0, \mathsf{X}'_1, \ldots) \circ \chi'^{-1}. \tag{5.11}$$

For this to be the case, $\alpha(M_0, M_1, \ldots)$ must be $(\det(Dg))^{-1} = \det(M)$.
So $\alpha$ is an antisymmetric function, and thus so is $\omega$.

Thus higher-rank form fields must be antisymmetric multilinear
functions from vector fields to manifold functions. So we have a
coordinate-independent definition of integration of form fields on
a manifold and we can write

$$I = I' = \int_{\mathsf{U}} \omega. \tag{5.12}$$

Wedge Product 

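There are several ways we can construct antisymmetric higher-rank
forms. Given two one-form fields $\omega$ and $\tau$ we can form a
two-form field $\omega \wedge \tau$ as follows:

$$(\omega \wedge \tau)(\mathsf{v}, \mathsf{w}) = \omega(\mathsf{v})\,\tau(\mathsf{w}) - \omega(\mathsf{w})\,\tau(\mathsf{v}). \tag{5.13}$$

More generally we can form the wedge of higher-rank forms. Let
$\omega$ be a $k$-form field and $\tau$ be an $l$-form field. We can form a
$(k + l)$-form field $\omega \wedge \tau$ as follows:

$$\omega \wedge \tau = \frac{(k + l)!}{k!\, l!}\, \mathrm{Alt}(\omega \otimes \tau) \tag{5.14}$$

where, if $\eta$ is a function on $m$ vectors,

$$\mathrm{Alt}(\eta)(\mathsf{v}_0, \ldots, \mathsf{v}_{m-1}) = \frac{1}{m!} \sum_{\sigma \in \mathrm{Perm}(m)} \mathrm{Parity}(\sigma)\, \eta(\mathsf{v}_{\sigma(0)}, \ldots, \mathsf{v}_{\sigma(m-1)}), \tag{5.15}$$

and where

$$\omega \otimes \tau(\mathsf{v}_0, \ldots, \mathsf{v}_{k-1}, \mathsf{v}_k, \ldots, \mathsf{v}_{k+l-1}) = \omega(\mathsf{v}_0, \ldots, \mathsf{v}_{k-1})\, \tau(\mathsf{v}_k, \ldots, \mathsf{v}_{k+l-1}). \tag{5.16}$$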
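
It may help to confirm that the general definition reduces to the two-argument formula (5.13). The check below takes the factorial coefficient in (5.14) to be $(k+l)!/(k!\,l!)$ and sets $k = l = 1$; $\mathrm{Perm}(2)$ then has two elements, the identity with parity $+1$ and the swap with parity $-1$.

```latex
\mathrm{Alt}(\omega\otimes\tau)(\mathsf{v}_0,\mathsf{v}_1)
  = \frac{1}{2!}\bigl(\omega(\mathsf{v}_0)\,\tau(\mathsf{v}_1)
      - \omega(\mathsf{v}_1)\,\tau(\mathsf{v}_0)\bigr),
\qquad
(\omega\wedge\tau)(\mathsf{v}_0,\mathsf{v}_1)
  = \frac{2!}{1!\,1!}\,\mathrm{Alt}(\omega\otimes\tau)(\mathsf{v}_0,\mathsf{v}_1)
  = \omega(\mathsf{v}_0)\,\tau(\mathsf{v}_1) - \omega(\mathsf{v}_1)\,\tau(\mathsf{v}_0).
```

The $1/2!$ from the antisymmetrizer is cancelled by the $2!$ in the coefficient, which is why no stray factor appears in (5.13).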

Figure 5.1 The area of the parallelogram in the $(x, y)$ coordinate
plane is given by $A(\mathsf{u}, \mathsf{v})(\mathsf{m})$.

The wedge product is associative, and thus we need not specify
the order of a multiple application. The factorial coefficients of
these formulas are chosen so that

$$(\mathsf{d}x \wedge \mathsf{d}y \wedge \cdots)(\partial/\partial x, \partial/\partial y, \ldots) = 1. \tag{5.17}$$

This is true independent of the coordinate system.

Equation (5.17) gives us

$$\int_{\mathsf{U}} \mathsf{d}x \wedge \mathsf{d}y \wedge \cdots = \mathrm{Volume}(U) \tag{5.18}$$

where Volume(U) is the ordinary volume of the region corresponding
to U in the Euclidean space $\mathbf{R}^n$ with the orthonormal coordinate
system $(x, y, \ldots)$. 3

An example two-form (see figure 5.1) is the oriented area of
a parallelogram in the $(x, y)$ coordinate plane at the point $\mathsf{m}$
spanned by two vectors $\mathsf{u} = u^0\,\partial/\partial x + u^1\,\partial/\partial y$ and
$\mathsf{v} = v^0\,\partial/\partial x + v^1\,\partial/\partial y$, which is given by

$$A(\mathsf{u}, \mathsf{v})(\mathsf{m}) = u^0(\mathsf{m})\, v^1(\mathsf{m}) - v^0(\mathsf{m})\, u^1(\mathsf{m}). \tag{5.19}$$


3 By using the word “orthonormal” here we are assuming that the range of 
the coordinate chart is an ordinary Euclidean space with the usual Euclidean 
metric. The coordinate basis in that chart is orthonormal. Under these con- 
ditions we can usefully use words like “length,” “area,” and “volume” in the 
coordinate space. 

Note that this is the area of the parallelogram in the coordinate
plane, which is the range of the coordinate function. It is not the 
area on the manifold. To define that, we need more structure — the 
metric. We will put a metric on the manifold in Chapter 9. 

3-Dimensional Euclidean Space 

Let’s specialize to 3-dimensional Euclidean space. Following
equation (5.18) we can write the coordinate-area two-form in another
way: $A = \mathsf{d}x \wedge \mathsf{d}y$. As code:

(define-coordinates (up x y z) R3-rect)

(define u (+ (* 'u^0 d/dx) (* 'u^1 d/dy)))

(define v (+ (* 'v^0 d/dx) (* 'v^1 d/dy)))

(((wedge dx dy) u v) R3-rect-point)
(+ (* u^0 v^1) (* -1 u^1 v^0))

If we use cylindrical coordinates and define cylindrical vector
fields we get the analogous answer in cylindrical coordinates:

(define-coordinates (up r theta z) R3-cyl)

(define a (+ (* 'a^0 d/dr) (* 'a^1 d/dtheta)))

(define b (+ (* 'b^0 d/dr) (* 'b^1 d/dtheta)))

(((wedge dr dtheta) a b) ((point R3-cyl) (up 'r0 'theta0 'z0)))
(+ (* a^0 b^1) (* -1 a^1 b^0))

The moral of this story is that this is the area of the parallelogram 
in the coordinate plane. It is not the area on the manifold! 

There is a similar story with volumes. The wedge product of the 
elements of the coordinate basis is a three-form that measures our 
usual idea of coordinate volumes in R 3 with a Euclidean metric: 

(define u (+ (* 'u^0 d/dx) (* 'u^1 d/dy) (* 'u^2 d/dz)))

(define v (+ (* 'v^0 d/dx) (* 'v^1 d/dy) (* 'v^2 d/dz)))

(define w (+ (* 'w^0 d/dx) (* 'w^1 d/dy) (* 'w^2 d/dz)))

(((wedge dx dy dz) u v w) R3-rect-point)
(+ (* u^0 v^1 w^2)
   (* -1 u^0 v^2 w^1)
   (* -1 u^1 v^0 w^2)
   (* u^1 v^2 w^0)
   (* u^2 v^0 w^1)
   (* -1 u^2 v^1 w^0))

This last expression is the determinant of a 3 × 3 matrix:

(- (((wedge dx dy dz) u v w) R3-rect-point)
   (determinant
    (matrix-by-rows (list 'u^0 'u^1 'u^2)
                    (list 'v^0 'v^1 'v^2)
                    (list 'w^0 'w^1 'w^2))))
0

If we did the same operations in cylindrical coordinates we would 
get the analogous formula, showing that what we are computing 
is volume in the coordinate space, not volume on the manifold. 

Because of antisymmetry, if the rank of a form is greater than
the dimension of the manifold then the form is identically zero.
The $k$-forms on an $n$-dimensional manifold form a module of
dimension $\binom{n}{k}$. We can write a coordinate-basis expression for a
$k$-form as

$$\omega = \sum_{i_0,\ldots,i_{k-1}=0}^{n-1} \omega_{i_0,\ldots,i_{k-1}}\, \mathsf{d}x^{i_0} \wedge \cdots \wedge \mathsf{d}x^{i_{k-1}}. \tag{5.20}$$

The antisymmetry of the wedge product implies that

$$\omega_{i_{\sigma(0)},\ldots,i_{\sigma(k-1)}} = \mathrm{Parity}(\sigma)\, \omega_{i_0,\ldots,i_{k-1}}, \tag{5.21}$$

from which we see that there are only $\binom{n}{k}$ independent components
of $\omega$.
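
The antisymmetry can be probed directly at the REPL. The following is a minimal sketch, assuming an scmutils session; it uses only procedures that already appear in this chapter, and the names swap-residual and repeat-residual are hypothetical, introduced here for illustration. Both values should simplify to 0: the first because exchanging the factors of a wedge of one-forms changes its sign, the second because a wedge with a repeated factor vanishes.

```scheme
;; Sketch only: assumes scmutils is loaded.
(define-coordinates (up x y z) R3-rect)
(define R3-rect-point ((point R3-rect) (up 'x0 'y0 'z0)))

;; Arbitrary literal vector fields to feed the forms.
(define u (literal-vector-field 'u-rect R3-rect))
(define v (literal-vector-field 'v-rect R3-rect))
(define w (literal-vector-field 'w-rect R3-rect))

;; dx ^ dy + dy ^ dx should vanish (antisymmetry of the wedge).
(define swap-residual
  (((+ (wedge dx dy) (wedge dy dx)) u v) R3-rect-point))

;; A wedge with a repeated one-form factor should vanish.
(define repeat-residual
  (((wedge dx dy dx) u v w) R3-rect-point))
```

Evaluating swap-residual and repeat-residual at the prompt should print 0 after simplification.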

Exercise 5.1: Wedge Product 

Pick a coordinate system and use the computer to verify that 

a. the wedge product is associative for forms in your coordinate system; 

b. formula (5.17) is true in your coordinate system. 

5.2 Exterior Derivative


The intention of introducing the exterior derivative is to capture 
all of the classical theorems of “vector analysis” into one unified 
Stokes’s Theorem, which asserts that the integral of a form on the 
boundary of a manifold is the integral of the exterior derivative of 
the form on the interior of the manifold: 4

$$\int_{\partial \mathsf{M}} \omega = \int_{\mathsf{M}} \mathsf{d}\omega. \tag{5.22}$$


As we have seen in equation (3.34), the differential of a function 
on a manifold is a one-form field. If a function on a manifold is 
considered to be a form field of rank zero, 5 then the differential 
operator increases the rank of the form by one. We can generalize 
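this to $k$-form fields with the exterior derivative operation.

Consider a one-form $\omega$. We define 6

$$\mathsf{d}\omega(\mathsf{v}_1, \mathsf{v}_2) = \mathsf{v}_1(\omega(\mathsf{v}_2)) - \mathsf{v}_2(\omega(\mathsf{v}_1)) - \omega([\mathsf{v}_1, \mathsf{v}_2]). \tag{5.23}$$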
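
A small worked instance shows how the three terms of this definition behave. The one-form $\omega = x\,\mathsf{d}y$ on the plane is an illustrative choice, not from the text. Take the coordinate basis fields as arguments; their commutator vanishes, $\omega(\partial/\partial y) = x$, and $\omega(\partial/\partial x) = 0$.

```latex
\mathsf{d}\omega(\partial/\partial x, \partial/\partial y)
  = \partial/\partial x\,\bigl(\omega(\partial/\partial y)\bigr)
  - \partial/\partial y\,\bigl(\omega(\partial/\partial x)\bigr)
  - \omega\bigl([\partial/\partial x, \partial/\partial y]\bigr)
  = \partial/\partial x\,(x) - \partial/\partial y\,(0) - \omega(0)
  = 1 .
```

So $\mathsf{d}(x\,\mathsf{d}y)$ agrees with $\mathsf{d}x \wedge \mathsf{d}y$ on the coordinate basis, as one would expect from the normalization of the wedge product.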

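More generally, the exterior derivative of a $k$-form field is a
$(k + 1)$-form field, given by: 7

$$\mathsf{d}\omega(\mathsf{v}_0, \ldots, \mathsf{v}_k) = \sum_{i=0}^{k} \Bigl\{ (-1)^i\, \mathsf{v}_i\bigl(\omega(\mathsf{v}_0, \ldots, \mathsf{v}_{i-1}, \mathsf{v}_{i+1}, \ldots, \mathsf{v}_k)\bigr) + \sum_{j=i+1}^{k} (-1)^{i+j}\, \omega\bigl([\mathsf{v}_i, \mathsf{v}_j], \mathsf{v}_0, \ldots, \mathsf{v}_{i-1}, \mathsf{v}_{i+1}, \ldots, \mathsf{v}_{j-1}, \mathsf{v}_{j+1}, \ldots, \mathsf{v}_k\bigr) \Bigr\}. \tag{5.24}$$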

This formula is coordinate-system independent. This is the way 
we compute the exterior derivative in our software. 


4 This is a generalization of the Fundamental Theorem of Calculus. 
5 A manifold function f induces a form field f of rank 0 as follows:
f()(m) = f(m).

6 The definition is chosen to make Stokes’s Theorem pretty.

7 See Spivak, Differential Geometry, Volume 1, p. 289.

If the form field $\omega$ is represented in a coordinate basis

$$\omega = \sum_{i_0=0,\ldots,i_{k-1}=0}^{n-1} a_{i_0,\ldots,i_{k-1}}\, \mathsf{d}x^{i_0} \wedge \cdots \wedge \mathsf{d}x^{i_{k-1}} \tag{5.25}$$

then the exterior derivative can be expressed as

$$\mathsf{d}\omega = \sum_{i_0=0,\ldots,i_{k-1}=0}^{n-1} \mathsf{d}a_{i_0,\ldots,i_{k-1}} \wedge \mathsf{d}x^{i_0} \wedge \cdots \wedge \mathsf{d}x^{i_{k-1}}. \tag{5.26}$$

Though this formula is expressed in terms of a coordinate basis, 
the result is independent of the choice of coordinate system. 

Computing Exterior Derivatives 

We can test that the computation indicated by equation (5.24)
is equivalent to the computation indicated by equation (5.26) in
three dimensions with a general one-form field:

(define a (literal-manifold-function 'alpha R3-rect))

(define b (literal-manifold-function 'beta R3-rect))

(define c (literal-manifold-function 'gamma R3-rect))

(define theta (+ (* a dx) (* b dy) (* c dz)))

The test will require two arbitrary vector fields

(define X (literal-vector-field 'X-rect R3-rect))

(define Y (literal-vector-field 'Y-rect R3-rect))

(((- (d theta)
     (+ (wedge (d a) dx)
        (wedge (d b) dy)
        (wedge (d c) dz)))
  X Y)
 R3-rect-point)
0

We can also try a general two-form field in 3-dimensional space:
Let

$$\omega = a\, \mathsf{d}y \wedge \mathsf{d}z + b\, \mathsf{d}z \wedge \mathsf{d}x + c\, \mathsf{d}x \wedge \mathsf{d}y, \tag{5.27}$$

where $a = \alpha \circ \chi$, $b = \beta \circ \chi$, $c = \gamma \circ \chi$, and $\alpha$, $\beta$, and $\gamma$ are
real-valued functions of three real arguments. As a program,

(define omega
  (+ (* a (wedge dy dz))
     (* b (wedge dz dx))
     (* c (wedge dx dy))))

Here we need another vector field because our result will be a
three-form field.

(define Z (literal-vector-field 'Z-rect R3-rect))

(((- (d omega)
     (+ (wedge (d a) dy dz)
        (wedge (d b) dz dx)
        (wedge (d c) dx dy)))
  X Y Z)
 R3-rect-point)
0

Properties of Exterior Derivatives 

The exterior derivative of the wedge of two form fields obeys the
graded Leibniz rule. It can be written in terms of the exterior
derivatives of the component form fields:

$$\mathsf{d}(\omega \wedge \tau) = \mathsf{d}\omega \wedge \tau + (-1)^k\, \omega \wedge \mathsf{d}\tau, \tag{5.28}$$

where $k$ is the rank of $\omega$.
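
The graded Leibniz rule can be spot-checked for two general one-form fields, for which $k = 1$ and the sign in front of the second term is negative. This is a sketch under stated assumptions: it presumes an scmutils session, and general-1form, omega1, tau1, and leibniz-residual are hypothetical names introduced only for this illustration. Everything else uses procedures that already appear in this chapter.

```scheme
;; Sketch only: assumes scmutils is loaded.
(define-coordinates (up x y z) R3-rect)
(define R3-rect-point ((point R3-rect) (up 'x0 'y0 'z0)))

;; Hypothetical helper: a one-form field with literal coefficient functions.
(define (general-1form c0 c1 c2)
  (+ (* (literal-manifold-function c0 R3-rect) dx)
     (* (literal-manifold-function c1 R3-rect) dy)
     (* (literal-manifold-function c2 R3-rect) dz)))

(define omega1 (general-1form 'p0 'p1 'p2))
(define tau1 (general-1form 'q0 'q1 'q2))

(define X (literal-vector-field 'X-rect R3-rect))
(define Y (literal-vector-field 'Y-rect R3-rect))
(define Z (literal-vector-field 'Z-rect R3-rect))

;; With k = 1 the rule reads d(omega ^ tau) = d omega ^ tau - omega ^ d tau.
;; The difference of the two sides is a three-form; apply it to X, Y, Z.
(define leibniz-residual
  (((- (d (wedge omega1 tau1))
       (- (wedge (d omega1) tau1)
          (wedge omega1 (d tau1))))
    X Y Z)
   R3-rect-point))
```

If the rule holds, leibniz-residual should simplify to 0. A check of this kind is not a proof, but it catches sign errors quickly.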

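A form field $\omega$ that is the exterior derivative of another form
field, $\omega = \mathsf{d}\theta$, is called exact. A form field whose exterior derivative
is zero is called closed.

Every exact form field is a closed form field: applying the exterior
derivative operator twice always yields zero:

$$\mathsf{d}^2 \omega = 0. \tag{5.29}$$

This is equivalent to the statement that partial derivatives with
respect to different variables commute. 8

It is easy to show equation (5.29) for manifold functions:

$$\begin{aligned}
\mathsf{d}^2 \mathsf{f}(\mathsf{u}, \mathsf{v}) &= \mathsf{d}(\mathsf{d}\mathsf{f})(\mathsf{u}, \mathsf{v}) \\
&= \mathsf{u}(\mathsf{d}\mathsf{f}(\mathsf{v})) - \mathsf{v}(\mathsf{d}\mathsf{f}(\mathsf{u})) - \mathsf{d}\mathsf{f}([\mathsf{u}, \mathsf{v}]) \\
&= \mathsf{u}(\mathsf{v}(\mathsf{f})) - \mathsf{v}(\mathsf{u}(\mathsf{f})) - [\mathsf{u}, \mathsf{v}](\mathsf{f}) \\
&= 0
\end{aligned} \tag{5.30}$$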


8 See Spivak, Calculus on Manifolds, p. 92.






Consider the general one-form field $\theta$ defined on 3-dimensional
rectangular space. Taking two exterior derivatives of $\theta$ yields a
three-form field. It is zero:

(((d (d theta)) X Y Z) R3-rect-point)

0 


Not every closed form field is an exact form field. Whether a 
closed form field is exact depends on the topology of a manifold. 
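
A standard illustration of this dependence on topology, added here as an example and not taken from the text, is the angle form on the plane with the origin removed:

```latex
\omega = \frac{-y\,\mathsf{d}x + x\,\mathsf{d}y}{x^2 + y^2},
\qquad
\mathsf{d}\omega = 0,
\qquad
\int_{S^1} \omega = 2\pi \neq 0 .
```

Here $S^1$ is the unit circle traversed once counterclockwise. If $\omega$ were $\mathsf{d}\mathsf{f}$ for some function $\mathsf{f}$ on the punctured plane, Stokes's Theorem applied to the circle, which has no boundary, would force the integral to vanish. So $\omega$ is closed but not exact; the obstruction is the hole at the origin.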


5.3 Stokes’s Theorem 

The proof of the general Stokes’s Theorem for n-dimensional ori- 
entable manifolds is quite complicated, but it is easy to see how 
it works for a 2-dimensional region M that can be covered with a 
single coordinate patch. 9 

Given a coordinate chart $\chi(\mathsf{m}) = (x(\mathsf{m}), y(\mathsf{m}))$ we can obtain a
pair of coordinate-basis vectors $\partial/\partial x = \mathsf{X}_0$ and $\partial/\partial y = \mathsf{X}_1$.

The coordinate image of M can be divided into small rectan- 
gular areas in the (x, y ) coordinate plane. The union of the rect- 
angular areas gives the coordinate image of M. The clockwise 
integrals around the boundaries of the rectangles cancel on neigh- 
boring rectangles, because the boundary is traversed in opposite 
directions. But on the boundary of the coordinate image of M 
the boundary integrals do not cancel, yielding an integral on the 
boundary of M. Area integrals over the rectangular areas add to 
produce an integral over the entire coordinate image of M. 

So, consider Stokes’s Theorem on a small patch P of the manifold
for which the coordinates form a rectangular region ($x_{\min} < x < x_{\max}$
and $y_{\min} < y < y_{\max}$). Stokes’s Theorem on P states

$$\int_{\partial P} \omega = \int_P \mathsf{d}\omega. \tag{5.31}$$

The area integral on the right can be written as an ordinary
multidimensional integral using the coordinate basis vectors (recall
that the integral is independent of the choice of coordinates):

$$\int_{\chi(P)} \mathsf{d}\omega(\partial/\partial x, \partial/\partial y) \circ \chi^{-1} = \int_{x_{\min}}^{x_{\max}} \int_{y_{\min}}^{y_{\max}} \bigl(\partial/\partial x(\omega(\partial/\partial y)) - \partial/\partial y(\omega(\partial/\partial x))\bigr) \circ \chi^{-1}. \tag{5.32}$$


9 We do not develop the machinery for integration on chains that is usually
needed for a full proof of Stokes’s Theorem. This is adequately done in other
books. A beautiful treatment can be found in Spivak, Calculus on Manifolds
[17].


We have used equation (5.23) to expand the exterior derivative. 

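Consider just the first term of the right-hand side of
equation (5.32). Then using the definition of basis vector field $\partial/\partial x$
we obtain

$$\begin{aligned}
\int_{x_{\min}}^{x_{\max}} \int_{y_{\min}}^{y_{\max}} \bigl(\partial/\partial x(\omega(\partial/\partial y))\bigr) \circ \chi^{-1}
&= \int_{x_{\min}}^{x_{\max}} \int_{y_{\min}}^{y_{\max}} \bigl(\mathsf{X}_0(\omega(\partial/\partial y))\bigr) \circ \chi^{-1} \\
&= \int_{x_{\min}}^{x_{\max}} \int_{y_{\min}}^{y_{\max}} \partial_0\bigl((\omega(\partial/\partial y)) \circ \chi^{-1}\bigr).
\end{aligned} \tag{5.33}$$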


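This integral can now be evaluated using the Fundamental Theorem
of Calculus. Accumulating the results for both integrals

$$\begin{aligned}
\int_{\chi(P)} \mathsf{d}\omega(\partial/\partial x, \partial/\partial y) \circ \chi^{-1}
&= \int_{x_{\min}}^{x_{\max}} \bigl((\omega(\partial/\partial x)) \circ \chi^{-1}\bigr)(x, y_{\min})\, dx \\
&\quad + \int_{y_{\min}}^{y_{\max}} \bigl((\omega(\partial/\partial y)) \circ \chi^{-1}\bigr)(x_{\max}, y)\, dy \\
&\quad - \int_{x_{\min}}^{x_{\max}} \bigl((\omega(\partial/\partial x)) \circ \chi^{-1}\bigr)(x, y_{\max})\, dx \\
&\quad - \int_{y_{\min}}^{y_{\max}} \bigl((\omega(\partial/\partial y)) \circ \chi^{-1}\bigr)(x_{\min}, y)\, dy \\
&= \int_{\partial P} \omega,
\end{aligned} \tag{5.34}$$

as was to be shown.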

5.4 Vector Integral Theorems

Green’s Theorem states that for an arbitrary compact set $M \subset \mathbf{R}^2$,
a 2-dimensional Euclidean space:

$$\int_{\partial M} \bigl((\alpha \circ \chi)\, \mathsf{d}x + (\beta \circ \chi)\, \mathsf{d}y\bigr) = \int_M \bigl((\partial_0 \beta - \partial_1 \alpha) \circ \chi\bigr)\, \mathsf{d}x \wedge \mathsf{d}y. \tag{5.35}$$

We can test this. By Stokes’s Theorem, the integrands are related
by an exterior derivative. We need some vectors to test our forms:

(define v (literal-vector-field 'v-rect R2-rect))

(define w (literal-vector-field 'w-rect R2-rect))

We can now test our integrands: 10

(define alpha (literal-function 'alpha R2->R))

(define beta (literal-function 'beta R2->R))

(let ((dx (ref (basis->1form-basis R2-rect-basis) 0))
      (dy (ref (basis->1form-basis R2-rect-basis) 1)))
  (((- (d (+ (* (compose alpha (chart R2-rect)) dx)
             (* (compose beta (chart R2-rect)) dy)))
       (* (compose (- ((partial 0) beta)
                      ((partial 1) alpha))
                   (chart R2-rect))
          (wedge dx dy)))
    v w)
   R2-rect-point))
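
The relation being tested can also be derived by hand. The following sketch writes $a = \alpha \circ \chi$ and $b = \beta \circ \chi$ (shorthand introduced here), expands the exterior derivative term by term in the coordinate basis, and uses $\mathsf{d}x \wedge \mathsf{d}x = 0$ and $\mathsf{d}y \wedge \mathsf{d}x = -\mathsf{d}x \wedge \mathsf{d}y$:

```latex
\mathsf{d}\bigl(a\,\mathsf{d}x + b\,\mathsf{d}y\bigr)
 = \mathsf{d}a \wedge \mathsf{d}x + \mathsf{d}b \wedge \mathsf{d}y
 = \Bigl(\frac{\partial a}{\partial x}\,\mathsf{d}x
        + \frac{\partial a}{\partial y}\,\mathsf{d}y\Bigr) \wedge \mathsf{d}x
 + \Bigl(\frac{\partial b}{\partial x}\,\mathsf{d}x
        + \frac{\partial b}{\partial y}\,\mathsf{d}y\Bigr) \wedge \mathsf{d}y
 = \Bigl(\frac{\partial b}{\partial x} - \frac{\partial a}{\partial y}\Bigr)\,
   \mathsf{d}x \wedge \mathsf{d}y .
```

Since $\partial b/\partial x = (\partial_0 \beta) \circ \chi$ and $\partial a/\partial y = (\partial_1 \alpha) \circ \chi$, this is the integrand on the right-hand side of Green's Theorem.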


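We can also compute the integrands for the Divergence Theorem:
For an arbitrary compact set $M \subset \mathbf{R}^3$ and a vector field $\mathsf{w}$

$$\int_M \mathrm{div}(\mathsf{w})\, dV = \int_{\partial M} \mathsf{w} \cdot \mathsf{n}\, dA, \tag{5.36}$$

where n is the outward-pointing normal to the surface $\partial M$. Again,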
the integrands should be related by an exterior derivative, if this 
is an instance of Stokes’s Theorem. 


10 Using (define R2-rect-basis (coordinate-system->basis R2-rect)). 

Here we extract dx and dy from R2-rect-basis to avoid globally installing 
coordinates. 

Note that even the statement of this theorem cannot be made
with the machinery we have developed at this point. The concepts
“outward-pointing normal,” area A, and volume V on the
manifold are not definable without using a metric (see Chapter 9).
However, for orthonormal rectangular coordinates in $\mathbf{R}^3$ we can
interpret the integrands in terms of forms.

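Let the vector field describing the flow of stuff be

$$\mathsf{w} = a\, \frac{\partial}{\partial x} + b\, \frac{\partial}{\partial y} + c\, \frac{\partial}{\partial z}. \tag{5.37}$$

The rate of leakage of stuff through each element of the boundary
is $\mathsf{w} \cdot \mathsf{n}\, dA$. We interpret this as the two-form

$$a\, \mathsf{d}y \wedge \mathsf{d}z + b\, \mathsf{d}z \wedge \mathsf{d}x + c\, \mathsf{d}x \wedge \mathsf{d}y, \tag{5.38}$$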


because any part of the boundary will have y-z, z-x, and x-y 
components, and each such component will pick up contributions 
from the normal component of the flux w. Formalizing this as 
code we have 


(define a (literal-manifold-function 'a-rect R3-rect))
(define b (literal-manifold-function 'b-rect R3-rect))
(define c (literal-manifold-function 'c-rect R3-rect))

(define flux-through-boundary-element
  (+ (* a (wedge dy dz))
     (* b (wedge dz dx))
     (* c (wedge dx dy))))


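The rate of production of stuff in each element of volume is
$\mathrm{div}(\mathsf{w})\, dV$. We interpret this as the three-form

$$\left(\frac{\partial}{\partial x} a + \frac{\partial}{\partial y} b + \frac{\partial}{\partial z} c\right) \mathsf{d}x \wedge \mathsf{d}y \wedge \mathsf{d}z, \tag{5.39}$$

or:

(define production-in-volume-element
  (* (+ (d/dx a) (d/dy b) (d/dz c))
     (wedge dx dy dz)))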

Assuming Stokes’s Theorem, the exterior derivative of the leakage
of stuff per unit area through the boundary must be the rate of
production of stuff per unit volume in the interior. We check this
by applying the difference to arbitrary vector fields at an arbitrary
point:

(define X (literal-vector-field 'X-rect R3-rect))

(define Y (literal-vector-field 'Y-rect R3-rect))

(define Z (literal-vector-field 'Z-rect R3-rect))

(((- production-in-volume-element
     (d flux-through-boundary-element))
  X Y Z)
 R3-rect-point)
0

as expected. 
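
The same result follows by hand from the coordinate-basis rule for the exterior derivative. In this sketch only the terms that survive antisymmetry are kept: in $\mathsf{d}a \wedge \mathsf{d}y \wedge \mathsf{d}z$ the $\mathsf{d}y$ and $\mathsf{d}z$ parts of $\mathsf{d}a$ drop out, and similarly for the other two terms.

```latex
\mathsf{d}\bigl(a\,\mathsf{d}y\wedge\mathsf{d}z
      + b\,\mathsf{d}z\wedge\mathsf{d}x
      + c\,\mathsf{d}x\wedge\mathsf{d}y\bigr)
 = \frac{\partial a}{\partial x}\,\mathsf{d}x\wedge\mathsf{d}y\wedge\mathsf{d}z
 + \frac{\partial b}{\partial y}\,\mathsf{d}y\wedge\mathsf{d}z\wedge\mathsf{d}x
 + \frac{\partial c}{\partial z}\,\mathsf{d}z\wedge\mathsf{d}x\wedge\mathsf{d}y
 = \Bigl(\frac{\partial a}{\partial x} + \frac{\partial b}{\partial y}
        + \frac{\partial c}{\partial z}\Bigr)\,
   \mathsf{d}x\wedge\mathsf{d}y\wedge\mathsf{d}z .
```

The last step uses the fact that $\mathsf{d}y\wedge\mathsf{d}z\wedge\mathsf{d}x$ and $\mathsf{d}z\wedge\mathsf{d}x\wedge\mathsf{d}y$ are cyclic (even) permutations of $\mathsf{d}x\wedge\mathsf{d}y\wedge\mathsf{d}z$, so no signs appear.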

Exercise 5.2: Graded Formula 

Derive equation (5.28). 

Exercise 5.3: Iterated Exterior Derivative 

We have shown that the equation (5.29) is true for manifold functions. 
Show that it is true for any form field. 

Over a Map


To deal with motion on manifolds we need to think about paths on 
manifolds and vectors along these paths. Tangent vectors along 
paths are not vector fields on the manifold because they are de- 
fined only on the path. And the path may even cross itself, which 
would give more than one vector at a point. Here we introduce 
the concept of a vector field over a map. 1 A vector field over a 
map assigns a vector to each image point of the map. In general 
the map may be a function from one manifold to another. If the 
domain of the map is the manifold of the real line, the range of 
the map is a 1-dimensional path on the target manifold. One pos- 
sible way to define a vector field over a map is to assign a tangent 
vector to each image point of a path, allowing us to work with 
tangent vectors to paths. A one-form field over the map allows us 
to extract the components of a vector field over the map. 


6.1 Vector Fields Over a Map 

Let $\mu$ be a map from points n in the manifold N to points m in the
manifold M. A vector over the map $\mu$ takes directional derivatives
of functions on M at points $\mathsf{m} = \mu(\mathsf{n})$. The vector over the map
applied to the function on M is a function on N.

Restricted Vector Fields

One way to make a vector field over a map is to restrict a vector
field on M to the image of N over $\mu$, as illustrated in figure 6.1.
Let v be a vector field on M, and f a function on M. Then

$$\mathsf{v}_\mu(\mathsf{f}) = \mathsf{v}(\mathsf{f}) \circ \mu, \tag{6.1}$$

is a vector over the map $\mu$. Note that $\mathsf{v}_\mu(\mathsf{f})$ is a function on N,
not M:

$$\mathsf{v}_\mu(\mathsf{f})(\mathsf{n}) = \mathsf{v}(\mathsf{f})(\mu(\mathsf{n})). \tag{6.2}$$


1 See Bishop and Goldberg, Tensor Analysis on Manifolds [3]. 

Figure 6.1 The vector field v on M is indicated by arrows. The solid
arrows are $\mathsf{v}_\mu$, the restricted vector field over the map $\mu$. The vector
field over the map is restricted to the image of N in M.


We can implement this definition as:

(define ((vector-field->vector-field-over-map mu:N->M) v-on-M)
  (procedure->vector-field
   (lambda (f-on-M)
     (compose (v-on-M f-on-M) mu:N->M))))


Differential of a Map 

Another way to construct a vector field over a map $\mu$ is to
transport a vector field from the source manifold N to the target
manifold M with the differential of the map

$$d\mu(\mathsf{v})(\mathsf{f})(\mathsf{n}) = \mathsf{v}(\mathsf{f} \circ \mu)(\mathsf{n}), \tag{6.3}$$

which takes its argument in the source manifold N. The differential
of a map $\mu$ applied to a vector field v on N is a vector field
over the map. A procedure to compute the differential is:

(define (((differential mu) v) f)
  (v (compose f mu)))

The nomenclature of this subject is confused. The “differential
of a map between manifolds,” $d\mu$, takes one more argument than
the “differential of a real-valued function on a manifold,” $\mathsf{d}\mathsf{f}$, but
when the target manifold of $\mu$ is the reals and $I$ is the identity
function on the reals,

$$d\mu(\mathsf{v})(I)(\mathsf{n}) = (\mathsf{v}(I \circ \mu))(\mathsf{n}) = (\mathsf{v}(\mu))(\mathsf{n}) = \mathsf{d}\mu(\mathsf{v})(\mathsf{n}). \tag{6.4}$$

We avoid this problem in our notation by distinguishing $d$ (the
differential of a map) and $\mathsf{d}$ (the exterior derivative). In our
programs we encode $d$ as differential and $\mathsf{d}$ as d.

Velocity at a Time

Let $\mu$ be the map from the time line to the manifold M, and $\partial/\partial \mathsf{t}$
be a basis vector on the time line. Then $d\mu(\partial/\partial \mathsf{t})$ is the vector
over the map $\mu$ that computes the rate of change of functions on
M along the path that is the image of $\mu$. This is the velocity
vector. We can use the differential to assign a velocity vector to
each moment, solving the problem of multiple vectors at a point
if the path crosses itself.
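
As an illustration, the velocity along a literal path in the plane can be applied to a literal function. This is a minimal sketch, assuming an scmutils session; the names gamma, alpha, beta, f, and rate-of-f are hypothetical and chosen only for this example, while the procedures themselves all appear elsewhere in the text.

```scheme
;; Sketch only: assumes scmutils is loaded.
(define-coordinates t R1-rect)         ; t is the coordinate on the time line
(define-coordinates (up x y) R2-rect)  ; note: rebinds x and y

;; A literal path in the plane, t |-> (alpha(t), beta(t)).
(define gamma
  (compose (point R2-rect)
           (up (literal-function 'alpha)
               (literal-function 'beta))
           (chart R1-rect)))

;; A literal real-valued function on the plane.
(define f (literal-manifold-function 'f-rect R2-rect))

;; The velocity vector over the map gamma, applied to f at time t0.
(define rate-of-f
  ((((differential gamma) d/dt) f)
   ((point R1-rect) 't0)))
```

By the chain rule, rate-of-f should be the sum of each partial derivative of f-rect, evaluated at (alpha(t0), beta(t0)), times the derivative of the corresponding component of the path at t0. The result depends only on the moment t0, not on whether the path revisits the same point of the plane at another time.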


6.2 One-Form Fields Over a Map 

Given a one-form $\omega$ on the manifold M, the one-form over the
map $\mu : \mathsf{N} \to \mathsf{M}$ is constructed as follows:

$$\omega^\mu(\mathsf{v}_\mu)(\mathsf{n}) = \omega(\mathsf{u})(\mu(\mathsf{n})), \quad \text{where} \quad \mathsf{u}(\mathsf{f})(\mathsf{m}) = \mathsf{v}_\mu(\mathsf{f})(\mathsf{n}). \tag{6.5}$$

The object u is not really a vector field on M even though we have
given it that shape so that the dual vector can apply to it; u(f) is
evaluated only at images $\mathsf{m} = \mu(\mathsf{n})$ of points n in N. If we were
defining u as a vector field we would need the inverse of $\mu$ to find
the point $\mathsf{n} = \mu^{-1}(\mathsf{m})$, but this is not required to define the object
u in a context where there is already an m associated with the n
of interest. To extend this idea to $k$-forms, we carry each vector
argument over the map.

The procedure that constructs a $k$-form over the map from a
$k$-form is:

(define ((form-field->form-field-over-map mu:N->M) w-on-M)
  (define (make-fake-vector-field V-over-mu n)
    (define ((u f) m)
      ((V-over-mu f) n))
    (procedure->vector-field u))
  (procedure->nform-field
   (lambda vectors-over-map
     (lambda (n)
       ((apply w-on-M
               (map (lambda (V-over-mu)
                      (make-fake-vector-field V-over-mu n))
                    vectors-over-map))
        (mu:N->M n))))
   (get-rank w-on-M)))

The internal procedure make-fake-vector-field counterfeits a
vector field u on M from the vector field over the map $\mu : \mathsf{N} \to \mathsf{M}$.
This works here because the only value that is ever passed as m is
(mu:N->M n).


6.3 Basis Fields Over a Map 

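Let $\mathsf{e}$ be a tuple of basis vector fields, and $\tilde{\mathsf{e}}$ be the tuple of basis
one-forms that is dual to $\mathsf{e}$:

$$\tilde{\mathsf{e}}^i(\mathsf{e}_j)(\mathsf{m}) = \delta^i_j. \tag{6.6}$$

The basis vectors over the map, $\mathsf{e}^\mu$, are particular cases of vectors
over a map:

$$\mathsf{e}^\mu(\mathsf{f}) = \mathsf{e}(\mathsf{f}) \circ \mu. \tag{6.7}$$

And the elements of the dual basis over the map, $\tilde{\mathsf{e}}_\mu$, are particular
cases of one-forms over the map. The basis and dual basis over
the map satisfy

$$\tilde{\mathsf{e}}^i_\mu(\mathsf{e}^\mu_j)(\mathsf{n}) = \delta^i_j. \tag{6.8}$$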


(6.8) 

Walking on a Sphere

For example, let $\mu$ map the time line to the unit sphere. 2 We use
colatitude $\theta$ and longitude $\phi$ as coordinates on the sphere:

(define S2 (make-manifold S^2 2 3))

(define S2-spherical
  (coordinate-system-at 'spherical 'north-pole S2))

(define-coordinates (up theta phi) S2-spherical)

(define S2-basis (coordinate-system->basis S2-spherical))

A general path on the sphere is: 3

(define mu
  (compose (point S2-spherical)
           (up (literal-function 'theta)
               (literal-function 'phi))
           (chart R1-rect)))

The basis over the map is constructed from the basis on the sphere:

(define S2-basis-over-mu
  (basis->basis-over-map mu S2-basis))

(define h
  (literal-manifold-function 'h-spherical S2-spherical))

(((basis->vector-basis S2-basis-over-mu) h)
 ((point R1-rect) 't0))
(down
 (((partial 0) h-spherical) (up (theta t0) (phi t0)))
 (((partial 1) h-spherical) (up (theta t0) (phi t0))))


The basis vectors over the map compute derivatives of the function 
h evaluated on the path at the given time. 


2 We execute (define-coordinates t R1-rect) to make t the coordinate
function of the real line.

3 We provide a shortcut to make literal manifold maps:

(define mu (literal-manifold-map 'mu R1-rect S2-spherical))

But if we used this shortcut, the component functions would be named mu^0
and mu^1. Here we wanted to use more mnemonic names for the component
functions.

We can check that the dual basis over the map does the correct
thing:

(((basis->1form-basis S2-basis-over-mu)
  (basis->vector-basis S2-basis-over-mu))
 ((point R1-rect) 't0))
(up (down 1 0) (down 0 1))

Components of the Velocity 

Let $\chi$ be a tuple of coordinates on M, with associated basis vectors
$\mathsf{X}_i$, and dual basis elements $\mathsf{d}x^i$. The vector basis and dual basis
over the map $\mu$ are $\mathsf{X}^\mu_i$ and $\mathsf{d}x^i_\mu$. The components of the velocity
(rates of change of coordinates along the path $\mu$) are obtained by
applying the dual basis over the map to the velocity

$$v^i(t) = \mathsf{d}x^i_\mu(d\mu(\partial/\partial \mathsf{t}))(\mathsf{t}), \tag{6.9}$$

where $t$ is the coordinate for the point $\mathsf{t}$.

For example, the coordinate velocities on a sphere are

(((basis->1form-basis S2-basis-over-mu)
  ((differential mu) d/dt))
 ((point R1-rect) 't0))
(up ((D theta) t0) ((D phi) t0))

as expected. 


6.4 Pullbacks and Pushforwards 

Maps from one manifold to another can also be used to relate 
the vector fields and one-form fields on one manifold to those 
on the other. We have introduced two such relations: restricted 
vector fields and the differential of a function. However, there are 
other ways to relate the vector fields and form fields on different 
manifolds that are connected by a map. 

Pullback and Pushforward of a Function 

The pullback of a function f on M over the map $\mu$ is defined as

$$\mu^* \mathsf{f} = \mathsf{f} \circ \mu. \tag{6.10}$$

This allows us to take a function defined on M and use it to define
a new function on N.

For example, the integral curve of v evolved for time $t$ as a
function of the initial manifold point m generates a map $\phi^{\mathsf{v}}_t$ of
the manifold onto itself. This is a simple currying 4 of the integral
curve of v from m as a function of time: $\phi^{\mathsf{v}}_t(\mathsf{m}) = \gamma^{\mathsf{v}}_{\mathsf{m}}(t)$. The
evolution of the function f along an integral curve, equation (3.33),
can be written in terms of the pullback over $\phi^{\mathsf{v}}_t$:

$$(E_{t,\mathsf{v}}\, \mathsf{f})(\mathsf{m}) = \mathsf{f}(\phi^{\mathsf{v}}_t(\mathsf{m})) = ((\phi^{\mathsf{v}}_t)^* \mathsf{f})(\mathsf{m}). \tag{6.11}$$

This is implemented as: 

(define ((pullback-function mu:N->M) f-on-M) 

(compose f-on-M mu:N->M)) 

A vector field over the map that was constructed by restriction
(equation 6.1) can be seen as the pullback of the function
constructed by application of the vector field to a function:

$$\mathsf{v}_\mu(\mathsf{f}) = \mathsf{v}(\mathsf{f}) \circ \mu = \mu^*(\mathsf{v}(\mathsf{f})). \tag{6.12}$$

A vector field over the map that was constructed by a differential
(equation 6.3) can be seen as the vector field applied to the
pullback of the function:

$$d\mu(\mathsf{v})(\mathsf{f})(\mathsf{n}) = \mathsf{v}(\mathsf{f} \circ \mu)(\mathsf{n}) = \mathsf{v}(\mu^* \mathsf{f})(\mathsf{n}). \tag{6.13}$$
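
The relation between the differential and the pullback of a function can be exercised directly. The following is a sketch, assuming an scmutils session; gamma, f, p, and differential-vs-pullback are hypothetical names used only here, and pullback-function is restated so that the block stands on its own.

```scheme
;; Sketch only: assumes scmutils is loaded.
(define-coordinates t R1-rect)
(define-coordinates (up x y) R2-rect)  ; note: rebinds x and y

;; The pullback of a function, as defined in the text.
(define ((pullback-function mu:N->M) f-on-M)
  (compose f-on-M mu:N->M))

;; A literal path in the plane and a literal function on the plane.
(define gamma
  (compose (point R2-rect)
           (up (literal-function 'alpha)
               (literal-function 'beta))
           (chart R1-rect)))
(define f (literal-manifold-function 'f-rect R2-rect))

(define p ((point R1-rect) 't0))

;; Left side: the differential of gamma applied to d/dt, acting on f.
;; Right side: d/dt acting on the pullback of f over gamma.
(define differential-vs-pullback
  (- ((((differential gamma) d/dt) f) p)
     ((d/dt ((pullback-function gamma) f)) p)))
```

The two sides are the same computation arranged differently, so differential-vs-pullback should simplify to 0.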

If we have an inverse for the map $\mu$, we can also define a
pushforward of the function g, defined on the source manifold of the
map: 5

$$\mu_* \mathsf{g} = \mathsf{g} \circ \mu^{-1}. \tag{6.14}$$


4 A function of two arguments may be seen as a function of one argument whose
value is a function of the other argument. This can be done in two different
ways, depending on which argument is supplied first. The general process of
specifying a subset of the arguments to produce a new function of the others
is called currying the function, in honor of the logician Haskell Curry
(1900-1982) who, with Moses Schönfinkel (1889-1942), developed combinatory logic.

5 Notation note: superscript asterisk indicates pullback, subscript asterisk
indicates pushforward. Pullbacks and pushforwards are tightly binding operators,
so, for example $\mu^* \mathsf{f}(\mathsf{n}) = (\mu^* \mathsf{f})(\mathsf{n})$.

Pushforward of a Vector Field

We can also define the pushforward of a vector field over the map $\mu$.
The pushforward takes a vector field v defined on N. The result
takes directional derivatives of functions on M at a place determined
by a point in M:

$$\mu_* \mathsf{v}(\mathsf{f})(\mathsf{m}) = \mathsf{v}(\mu^* \mathsf{f})(\mu^{-1}(\mathsf{m})) = \mathsf{v}(\mathsf{f} \circ \mu)(\mu^{-1}(\mathsf{m})), \tag{6.15}$$

or

$$\mu_* \mathsf{v}(\mathsf{f}) = \mu_*(\mathsf{v}(\mu^* \mathsf{f})). \tag{6.16}$$

Here we expressed the pushforward of the vector field in terms of 
pullbacks and pushforwards of functions. Note that the pushfor- 
ward requires the inverse of the map. 

If the map is from time to some configuration manifold and 
represents the time evolution of a process, we can think of the 
pushforward of a vector field as a velocity measured at a point 
on the trajectory in the configuration manifold. By contrast, the 
differential of the map applied to the vector field gives us the 
velocity vector at each moment in time. Because a trajectory may 
cross itself, the pushforward is not defined at any point where the 
crossing occurs, but the differential is always defined. 
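
A one-dimensional example may make the pushforward concrete. The choices here are illustrative assumptions, not from the text: let $\mu$ map the real line to itself by $\mu(x) = 2x$, so $\mu^{-1}(m) = m/2$, and let $\mathsf{v} = \partial/\partial x$. Writing $f$ for the coordinate representation of $\mathsf{f}$,

```latex
\mu_*\mathsf{v}(\mathsf{f})(\mathsf{m})
 = \mathsf{v}(\mathsf{f}\circ\mu)(\mu^{-1}(\mathsf{m}))
 = \Bigl.\frac{d}{dx} f(2x)\Bigr|_{x = m/2}
 = 2\,Df(m),
\qquad\text{so}\qquad
\mu_*\,\frac{\partial}{\partial x} = 2\,\frac{\partial}{\partial x}.
```

Stretching the line by a factor of two doubles the pushed-forward vector field, and the inverse map is needed only to locate the source point at which $\mathsf{v}$ is evaluated.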

Pushforward Along Integral Curves 

We can push a vector field forward over the map generated by an
integral curve of a vector field w, because the inverse is always
available. 6

$$((\phi^{\mathsf{w}}_t)_* \mathsf{v})(\mathsf{f})(\mathsf{m}) = \mathsf{v}((\phi^{\mathsf{w}}_t)^* \mathsf{f})(\phi^{\mathsf{w}}_{-t}(\mathsf{m})) = \mathsf{v}(\mathsf{f} \circ \phi^{\mathsf{w}}_t)(\phi^{\mathsf{w}}_{-t}(\mathsf{m})). \tag{6.17}$$

This is implemented as:

(define ((pushforward-vector mu:N->M mu^-1:M->N) v-on-N)
  (procedure->vector-field
   (lambda (f)
     (compose (v-on-N (compose f mu:N->M)) mu^-1:M->N))))


6 The map $\phi^{\mathsf{w}}_t$ is always invertible: $(\phi^{\mathsf{w}}_t)^{-1} = \phi^{\mathsf{w}}_{-t}$ because of the uniqueness of
the solutions of the initial-value problem for ordinary differential equations.
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Pullback of a Vector Field 

Given a vector field v on manifold M we can pull the vector field
back through the map $\mu : N \to M$ as follows:

$$\mu^* v(f)(n) = (v(f \circ \mu^{-1}))(\mu(n)) \tag{6.18}$$

or

$$\mu^* v(f) = \mu^*(v(\mu_* f)). \tag{6.19}$$

This may be useful when the map is invertible, as in the flow 
generated by a vector field. 

This is implemented as: 

(define (pullback-vector-field mu:N->M mu^-1:M->N)
  (pushforward-vector mu^-1:M->N mu:N->M))

Pullback of a Form Field 

We can also pull back a one-form field $\omega$ defined on M, but an
honest definition is rarely written. The pullback of a one-form
field applied to a vector field is intended to be the same as the
one-form field applied to the pushforward of the vector field.

The pullback of a one-form field is often described by the
relation

$$\mu^* \omega(v) = \omega(\mu_* v), \tag{6.20}$$

but this is wrong, because the two sides are not functions of points
in the same manifold. The one-form field $\omega$ applies to a vector
field on the manifold M, which takes a directional derivative of a
function defined on M and is evaluated at a point on M, but the
left-hand side is evaluated at a point on the manifold N.

A more precise description would be 


$$\mu^* \omega(v)(n) = \omega(\mu_* v)(\mu(n)) \tag{6.21}$$

or

$$\mu^* \omega(v) = \mu^*(\omega(\mu_* v)). \tag{6.22}$$

Although this is accurate, it may not be effective, because
computing the pushforward requires the inverse of the map $\mu$. But
the inverse is available when the map is the flow generated by a 
vector field. 

In fact it is possible to compute the pullback of a one-form
field without having the inverse of the map. Instead we can use
form-field->form-field-over-map to avoid needing the inverse:

$$\mu^* \omega(v)(n) = \omega^{\mu}(d\mu(v))(n). \tag{6.23}$$

The pullback of a k-form generalizes equation 6.21:

$$\mu^* \omega(u, v, \ldots)(n) = \omega(\mu_* u, \mu_* v, \ldots)(\mu(n)). \tag{6.24}$$

This is implemented as follows: 7 

(define ((pullback-form mu:N->M) omega-on-M)
  (let ((k (get-rank omega-on-M)))
    (if (= k 0)
        ((pullback-function mu:N->M) omega-on-M)
        (procedure->nform-field
         (lambda vectors-on-N
           (apply ((form-field->form-field-over-map mu:N->M)
                   omega-on-M)
                  (map (differential mu:N->M) vectors-on-N)))
         k))))

Properties of Pullback 

The pullback through a map has many nice properties: it
distributes through addition and through wedge product:

$$\mu^*(\theta + \phi) = \mu^* \theta + \mu^* \phi, \tag{6.25}$$

$$\mu^*(\theta \wedge \phi) = \mu^* \theta \wedge \mu^* \phi. \tag{6.26}$$

The pullback also commutes with the exterior derivative:

$$d(\mu^* \theta) = \mu^*(d\theta), \tag{6.27}$$

for $\theta$ a function or k-form field.
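
The commutation of pullback and exterior derivative (equation 6.27) can be checked by hand in a small case. The sketch below assumes a hypothetical map $\mu(t) = (\cos t, \sin t)$ from the line to the plane and the function $f(x, y) = xy$; neither appears in the text.

```latex
\begin{aligned}
\mu^* f &= \cos t \, \sin t, &
d(\mu^* f) &= (\cos^2 t - \sin^2 t)\, dt, \\
df &= y\, dx + x\, dy, &
\mu^*(df) &= \sin t\, (-\sin t\, dt) + \cos t\, (\cos t\, dt)
           = (\cos^2 t - \sin^2 t)\, dt.
\end{aligned}
```

Both orders give the same one-form on the line, as equation (6.27) requires.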


7 There is a generic pullback procedure that operates on any kind of manifold 
object. However, to pull a vector field back requires providing the inverse 
map.

We can verify this by computing an example. Let $\mu$ map the
rectangular plane to rectangular 3-space:

(define mu (literal-manifold-map 'MU R2-rect R3-rect))

First, let’s compare the pullback of the exterior derivative of a 
function with the exterior derivative of the pullback of the func- 
tion: 

(define f (literal-manifold-function 'f-rect R3-rect))

(define X (literal-vector-field 'X-rect R2-rect))

(((- ((pullback mu) (d f)) (d ((pullback mu) f))) X)
 ((point R2-rect) (up 'x0 'y0)))

0 

More generally, we can consider what happens to a form field. For 
a one-form field the result is as expected: 

(define theta (literal-1form-field 'THETA R3-rect))

(define Y (literal-vector-field 'Y-rect R2-rect))

(((- ((pullback mu) (d theta)) (d ((pullback mu) theta))) X Y)
 ((point R2-rect) (up 'x0 'y0)))

0 

Pushforward of a Form Field 

By symmetry, it is possible to define the pushforward of a
one-form field as

$$\mu_* \omega(v) = \mu_*(\omega(\mu^* v)), \tag{6.28}$$

but this is rarely useful. 

Exercise 6.1: Velocities on a Globe 

We can use manifold functions, vector fields, and one-forms over a map 
to understand how paths behave. 

a. Suppose that a vehicle is traveling east on the Earth at a given rate 
of change of longitude. What is the actual ground speed of the vehicle? 

b. Stereographic projection is useful for navigation because it is confor- 
mal (it preserves angles). For the situation of part a, what is the speed 
measured on a stereographic map? Remember that the stereographic 
projection is implemented with S2-Riemann. 





7 

Directional Derivatives 


The vector field was a generalization of the directional derivative 
to functions on a manifold. When we want to generalize the direc- 
tional derivative idea to operate on other manifold objects, such 
as directional derivatives of vector fields or of form fields, there 
are several useful choices. In the same way that a vector field ap- 
plies to a function to produce a function, we will build directional 
derivatives so that when applied to any object it will produce an- 
other object of the same kind. All directional derivatives require 
a vector field to give the direction and scale factor. 

We will have a choice of directional derivative operators that 
give different results for the rate of change of vector and form 
fields along integral curves. But all directional derivative oper- 
ators must agree when computing rates of change of functions 
along integral curves. When applied to functions, all directional
derivative operators give:

$$\mathcal{D}_v(f) = v(f). \tag{7.1}$$

Next we specify the directional derivative of a vector field u
with respect to a vector field v. Let an integral curve of the vector
field v be $\gamma$, parameterized by t, and let $m = \gamma(t)$. Let $u'$ be a
vector field that results from transporting the vector field u along
$\gamma$ for a parameter increment $\delta$. How u is transported to make $u'$
determines the type of derivative. We formulate the method of
transport by:

$$u' = F^v_\delta u. \tag{7.2}$$

We can assume without loss of generality that $F^v_\delta u$ is a linear
transformation over the reals on u, because we care about its behavior
only in an incremental region around $\delta = 0$.

Let g be the comparison of the original vector field at a point 
with the transported vector field at that point: 


$$g(\delta) = u(f)(m) - (F^v_\delta u)(f)(m). \tag{7.3}$$

So we can compute the directional derivative operator using only
ordinary derivatives:

$$\mathcal{D}_v u(f)(m) = Dg(0). \tag{7.4}$$

The result $\mathcal{D}_v u$ is of type vector field.

The general pattern of constructing a directional derivative op- 
erator from a transport operator is given by the following schema: 1 

(define (((((F->directional-derivative F) v) u) f) m)
  (define (g delta)
    (- ((u f) m) (((((F v) delta) u) f) m)))
  ((D g) 0))

The linearity of transport implies that

$$\mathcal{D}_v(\alpha O + \beta P) = \alpha \mathcal{D}_v O + \beta \mathcal{D}_v P, \tag{7.5}$$

for any real $\alpha$ and $\beta$ and manifold objects O and P.

The directional derivative obeys superposition in its vector-field
argument:

$$\mathcal{D}_{v+w} = \mathcal{D}_v + \mathcal{D}_w. \tag{7.6}$$

The directional derivative is homogeneous over the reals in its
vector-field argument:

$$\mathcal{D}_{\alpha v} = \alpha \mathcal{D}_v, \tag{7.7}$$

for any real $\alpha$. 2 This follows from the fact that, for evolution along
integral curves, when $\alpha$ is a real number,

$$\phi^{\alpha v}_t(m) = \phi^v_{\alpha t}(m). \tag{7.8}$$

When applied to products of functions, directional derivative
operators satisfy Leibniz’s rule:

$$\mathcal{D}_v(fg) = f(\mathcal{D}_v g) + (\mathcal{D}_v f)g. \tag{7.9}$$


1 The directional derivative of a vector field must itself be a vector field. Thus
the real program for this must make the function of f into a vector field.
However, we leave out this detail here to make the structure clear.

2 For some derivative operators $\alpha$ can be a real-valued manifold function.

The Leibniz rule is extended to applications of one-form fields to
vector fields:

$$\mathcal{D}_v(\omega(y)) = \omega(\mathcal{D}_v y) + (\mathcal{D}_v \omega)(y). \tag{7.10}$$

The extension of the Leibniz rule, combined with the choice of 
transport of a vector field, determines the action of the directional 
derivative on form fields. 3 


7.1 Lie Derivative 

The Lie derivative is one kind of directional derivative operator. 
We write the Lie derivative operator with respect to a vector field
v as $\mathcal{L}_v$.

Functions 

The Lie derivative of the function f with respect to the vector field 
v is given by 

$$\mathcal{L}_v f = v(f). \tag{7.11}$$

The tangent vector v measures the rate of change of f along inte- 
gral curves. 

Vector Fields 

For the Lie derivative of a vector field y with respect to a vector
field v we choose the transport operator $F^v_\delta y$ to be the pushforward
of y along the integral curves of v. Recall equation (6.15). So the
Lie derivative of y with respect to v at the point m is

$$(\mathcal{L}_v y)(f)(m) = Dg(0), \tag{7.12}$$

where

$$g(\delta) = y(f)(m) - ((\phi^v_\delta)_* y)(f)(m). \tag{7.13}$$

We can construct a procedure that computes the Lie derivative 
of a vector field by supplying an appropriate transport operator 


3 The action on functions, vector fields, and one-form fields suffices to define 
the action on all tensor fields. See Appendix C.

(F-Lie phi) for F in our schema F->directional-derivative.
In this first stab at the Lie derivative, we introduce a coordinate 
system and we expand the integral curve to a given order. Because 
in the schema we evaluate the derivative of g at 0, the dependence 
on the order and the coordinate system disappears. They will not 
be needed in the final version. 

(define (Lie-directional coordsys order)
  (let ((Phi (phi coordsys order)))
    (F->directional-derivative (F-Lie Phi))))

(define (((F-Lie phi) v) delta)
  (pushforward-vector ((phi v) delta) ((phi v) (- delta))))

(define ((((phi coordsys order) v) delta) m)
  ((point coordsys)
   (series:sum (((exp (* delta v)) (chart coordsys)) m)
               order)))

Expand the quantities in equation (7.13) to first order in $\delta$:

$$\begin{aligned}
g(\delta) &= y(f)(m) - ((\phi^v_\delta)_* y)(f)(m) \\
&= y(f)(m) - y(f \circ \phi^v_\delta)(\phi^v_{-\delta}(m)) \\
&= (y(f) - y(f + \delta v(f) + \cdots) + \delta v(y(f + \delta v(f) + \cdots)))(m) + \cdots \\
&= (-\delta y(v(f)) + \delta v(y(f)))(m) + \cdots \\
&= \delta [v, y](f)(m) + O(\delta^2).
\end{aligned} \tag{7.14}$$

So the Lie derivative of a vector field y with respect to a vector
field v is a vector field that is defined by its behavior when applied
to an arbitrary manifold function f:

$$(\mathcal{L}_v y)(f) = [v, y](f). \tag{7.15}$$
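
To make equation (7.15) concrete, here is a small hand computation. It assumes the hypothetical vector fields $v = \partial/\partial x$ and $y = x\, \partial/\partial y$ on the plane, chosen only for illustration.

```latex
\begin{aligned}
v(y(f)) &= \frac{\partial}{\partial x}\Big( x \frac{\partial f}{\partial y} \Big)
         = \frac{\partial f}{\partial y} + x \frac{\partial^2 f}{\partial x\, \partial y}, \\
y(v(f)) &= x \frac{\partial^2 f}{\partial y\, \partial x}, \\
(\mathcal{L}_v y)(f) &= [v, y](f) = v(y(f)) - y(v(f)) = \frac{\partial f}{\partial y}.
\end{aligned}
```

The second-derivative terms cancel, so the result is again a first-order operator: for these choices $\mathcal{L}_v y = \partial/\partial y$.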

Verifying this computation 

(let ((v (literal-vector-field 'v-rect R3-rect))
      (w (literal-vector-field 'w-rect R3-rect))
      (f (literal-manifold-function 'f-rect R3-rect)))
  ((- ((((Lie-directional R3-rect 2) v) w) f)
      ((commutator v w) f))
   ((point R3-rect) (up 'x0 'y0 'z0))))

0

Although this is tested to second order, evaluating the derivative
at zero ensures that first order is enough. So we can safely define:

(define ((Lie-derivative-vector V) Y)
  (commutator V Y))

We can think of the Lie derivative as the rate of change of the
manifold function y(f) as we move in the v direction, adjusted to
take into account that some of the variation is due to the variation
of f:

$$\begin{aligned}
(\mathcal{L}_v y)(f) &= [v, y](f) \\
&= v(y(f)) - y(v(f)) \\
&= v(y(f)) - y(\mathcal{L}_v(f)).
\end{aligned} \tag{7.16}$$

The first term in the commutator, v(y(f)), measures the rate of 
change of the combination y(f) along the integral curves of v. The 
change in y(f) is due to both the intrinsic change in y along the 
curve and the change in f along the curve; the second term in 
the commutator subtracts this latter quantity. The result is the 
intrinsic change in y along the integral curves of v. 

Additionally, we can extend the product rule, for any manifold 
function g and any vector field u: 

$$\begin{aligned}
\mathcal{L}_v(gu)(f) &= [v, gu](f) \\
&= v(g)u(f) + g[v, u](f) \\
&= (\mathcal{L}_v g)u(f) + g(\mathcal{L}_v u)(f).
\end{aligned} \tag{7.17}$$
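
The middle step of equation (7.17) uses only the fact that a vector field acts on a product of functions by Leibniz’s rule. Written out, as a sketch of the omitted algebra:

```latex
\begin{aligned}
[v, gu](f) &= v\big(g\, u(f)\big) - g\, u\big(v(f)\big) \\
&= v(g)\, u(f) + g\, v\big(u(f)\big) - g\, u\big(v(f)\big) \\
&= v(g)\, u(f) + g\, [v, u](f).
\end{aligned}
```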

An Alternate View 

We can write the vector field

$$y(f) = \sum_i y^i e_i(f). \tag{7.18}$$

By the extended product rule (equation 7.17) we get

$$\mathcal{L}_v y(f) = \sum_i \big( v(y^i) e_i(f) + y^i \mathcal{L}_v e_i(f) \big). \tag{7.19}$$

Because the Lie derivative of a vector field is a vector field, we can
extract the components of $\mathcal{L}_v e_i$ using the dual basis. We define
$\Delta^i_j(v)$ to be those components:

$$\Delta^i_j(v) = \tilde{e}^i(\mathcal{L}_v e_j) = \tilde{e}^i([v, e_j]). \tag{7.20}$$

So the Lie derivative can be written

$$(\mathcal{L}_v y)(f) = \sum_i \Big( v(y^i) + \sum_j \Delta^i_j(v) y^j \Big) e_i(f). \tag{7.21}$$

The components of the Lie derivatives of the basis vector fields 
are the structure constants for the basis vector fields. (See equa- 
tion 4.37.) The structure constants are antisymmetric in the lower 
indices: 

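$$\tilde{e}^i(\mathcal{L}_{e_k} e_j) = \tilde{e}^i([e_k, e_j]) = d^i_{kj}. \tag{7.22}$$

Resolving v into components and applying the product rule, we
get

$$(\mathcal{L}_v y)(f) = \sum_k \big( v^k [e_k, y](f) - y(v^k) e_k(f) \big). \tag{7.23}$$

So $\Delta^i_j$ is related to the structure constants by

$$\begin{aligned}
\Delta^i_j(v) &= \tilde{e}^i(\mathcal{L}_v e_j) \\
&= \sum_k \big( v^k\, \tilde{e}^i([e_k, e_j]) - e_j(v^k)\, \tilde{e}^i(e_k) \big) \\
&= \sum_k \big( v^k d^i_{kj} - e_j(v^k) \delta^i_k \big) \\
&= \sum_k v^k d^i_{kj} - e_j(v^i).
\end{aligned} \tag{7.24}$$

Note: Despite their appearance, the $\Delta^i_j$ are not form fields because
$\Delta^i_j(fv) \neq f \Delta^i_j(v)$.

Form Fields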

We can also define the Lie derivative of a form field $\omega$ with respect
to the vector field v by its action on an arbitrary vector field y,
using the extended Leibniz rule (see equation 7.10):

$$(\mathcal{L}_v \omega)(y) = v(\omega(y)) - \omega(\mathcal{L}_v y). \tag{7.25}$$

The first term computes the rate of change of the combination
$\omega(y)$ along the integral curve of v, while the second subtracts $\omega$
applied to the change in y. The result is the change in $\omega$ along
the curve.
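
Equation (7.25) can be exercised by hand. The sketch below assumes the plane with the hypothetical choices $v = x\, \partial/\partial x$, $\omega = dx$, and an arbitrary vector field $y = a\, \partial/\partial x + b\, \partial/\partial y$ with coefficient functions a and b.

```latex
\begin{aligned}
\omega(y) &= a, \qquad v(\omega(y)) = x \frac{\partial a}{\partial x}, \\
\omega(\mathcal{L}_v y) &= dx([v, y]) = v(a) - y(x) = x \frac{\partial a}{\partial x} - a, \\
(\mathcal{L}_v\, dx)(y) &= x \frac{\partial a}{\partial x} - \Big( x \frac{\partial a}{\partial x} - a \Big) = a = dx(y).
\end{aligned}
```

So for this scaling field $\mathcal{L}_v\, dx = dx$: the one-form dx grows at unit rate along the flow of v, which stretches the x direction exponentially.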

The Lie derivative of a k-form field $\omega$ with respect to a
vector field v is a k-form field that is defined by its behavior when
applied to k arbitrary vector fields $w_0, \ldots, w_{k-1}$. We generalize
equation (7.25):

$$\mathcal{L}_v \omega(w_0, \ldots, w_{k-1}) = v(\omega(w_0, \ldots, w_{k-1})) - \sum_{i=0}^{k-1} \omega(w_0, \ldots, \mathcal{L}_v w_i, \ldots, w_{k-1}). \tag{7.26}$$


Uniform Interpretation 

Consider abstracting the equations (7.16), (7.25), and (7.26). The
Lie derivative of an object, a, that can apply to other objects, b,
to produce manifold functions, $a(b) : M \to \mathbf{R}^n$, is

$$(\mathcal{L}_v a)(b) = v(a(b)) - a(\mathcal{L}_v b). \tag{7.27}$$

The first term in this expression computes the rate of change of
the compound object a(b) along integral curves of v, while the
second subtracts the change in a due to the change in b along the 
curves. The result is a measure of the “intrinsic” change in a along 
integral curves of v, with b held “fixed.” 

Properties of the Lie Derivative 

As required by properties 7.5–7.7, the Lie derivative is linear in
its arguments:

$$\mathcal{L}_{\alpha v + \beta w} = \alpha \mathcal{L}_v + \beta \mathcal{L}_w, \tag{7.28}$$

and

$$\mathcal{L}_v(\alpha a + \beta b) = \alpha \mathcal{L}_v a + \beta \mathcal{L}_v b, \tag{7.29}$$

with $\alpha, \beta \in \mathbf{R}$ and vector fields or one-form fields a and b.

For any k-form field $\omega$ and any vector field v the exterior
derivative commutes with the Lie derivative with respect to the vector
field:

$$\mathcal{L}_v(d\omega) = d(\mathcal{L}_v \omega). \tag{7.30}$$

If $\omega$ is an element of surface then $d\omega$ is an element of volume.
The Lie derivative computes the rate of change of its argument
under a deformation described by the vector field. The answer
is the same whether we deform the surface before computing the
volume or compute the volume and then deform it.

We can verify this in 3-dimensional rectangular space for a
general one-form field: 4

(((- ((Lie-derivative V) (d theta))
     (d ((Lie-derivative V) theta)))
  X Y)
 R3-rect-point)

0

and for the general two-form field:


4 In these experiments we need some setup.

(define a (literal-manifold-function 'alpha R3-rect))
(define b (literal-manifold-function 'beta R3-rect))
(define c (literal-manifold-function 'gamma R3-rect))

(define-coordinates (up x y z) R3-rect)

(define theta (+ (* a dx) (* b dy) (* c dz)))

(define omega
  (+ (* a (wedge dy dz))
     (* b (wedge dz dx))
     (* c (wedge dx dy))))

(define X (literal-vector-field 'X-rect R3-rect))
(define Y (literal-vector-field 'Y-rect R3-rect))
(define Z (literal-vector-field 'Z-rect R3-rect))
(define V (literal-vector-field 'V-rect R3-rect))

(define R3-rect-point
  ((point R3-rect) (up 'x0 'y0 'z0)))

(((- ((Lie-derivative V) (d omega))
     (d ((Lie-derivative V) omega)))
  X Y Z)
 R3-rect-point)

0


The Lie derivative satisfies another nice elementary relationship.
If v and w are two vector fields, then

$$[\mathcal{L}_v, \mathcal{L}_w] = \mathcal{L}_{[v,w]}. \tag{7.31}$$

Again, for our general one-form field $\theta$:

((((- (commutator (Lie-derivative X) (Lie-derivative Y))
      (Lie-derivative (commutator X Y)))
   theta)
  Z)
 R3-rect-point)

0

and for the two-form field $\omega$:

((((- (commutator (Lie-derivative X) (Lie-derivative Y))
      (Lie-derivative (commutator X Y)))
   omega)
  Z V)
 R3-rect-point)

0 

Exponentiating Lie Derivatives 

The Lie derivative computes the rate of change of objects as they 
are advanced along integral curves. The Lie derivative of an object 
produces another object of the same type, so we can iterate Lie 
derivatives. This gives us Taylor series for objects along the curve. 

The operator $e^{t\mathcal{L}_v} = 1 + t\mathcal{L}_v + \frac{t^2}{2!}\mathcal{L}_v^2 + \cdots$ evolves objects along
the curve by parameter t. For example, the exponential of a Lie
derivative applied to a vector field is

$$\begin{aligned}
e^{t\mathcal{L}_v} y &= y + t \mathcal{L}_v y + \frac{t^2}{2} \mathcal{L}_v^2 y + \cdots \\
&= y + t[v, y] + \frac{t^2}{2} [v, [v, y]] + \cdots.
\end{aligned} \tag{7.32}$$

Consider a simple case. We advance the coordinate-basis vector
field $\partial/\partial y$ by an angle a around the circle. Let $J_z = x\, \partial/\partial y - y\, \partial/\partial x$,
the circular vector field. We recall

(define Jz (- (* x d/dy) (* y d/dx))) 

We can apply the exponential of the Lie derivative with respect to
$J_z$ to $\partial/\partial y$. We examine how the result affects a general function
on the manifold:

(series:for-each print-expression
 ((((exp (* 'a (Lie-derivative Jz))) d/dy)
   (literal-manifold-function 'f-rect R3-rect))
  ((point R3-rect) (up 1 0 0)))
 5)

(((partial 0) f-rect) (up 1 0))
(* -1 a (((partial 1) f-rect) (up 1 0)))
(* -1/2 (expt a 2) (((partial 0) f-rect) (up 1 0)))
(* 1/6 (expt a 3) (((partial 1) f-rect) (up 1 0)))
(* 1/24 (expt a 4) (((partial 0) f-rect) (up 1 0)))
;Value: ...

Apparently the result is

$$\exp\big(a\, \mathcal{L}_{(x\, \partial/\partial y - y\, \partial/\partial x)}\big) \frac{\partial}{\partial y} = -\sin(a) \frac{\partial}{\partial x} + \cos(a) \frac{\partial}{\partial y}. \tag{7.33}$$

Interior Product 

There is a simple but useful operation available between vector
fields and form fields called interior product. This is the
substitution of a vector field v into the first argument of a p-form field $\omega$
to produce a (p − 1)-form field:

$$(i_v \omega)(v_1, \ldots, v_{p-1}) = \omega(v, v_1, \ldots, v_{p-1}). \tag{7.34}$$
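
A small instance of equation (7.34), assuming the hypothetical choices $\omega = dx \wedge dy$ and $v = \partial/\partial x$ on the plane, and the usual normalization $(dx \wedge dy)(v, w) = dx(v)\, dy(w) - dx(w)\, dy(v)$:

```latex
\begin{aligned}
(i_v \omega)(w) &= (dx \wedge dy)\Big( \frac{\partial}{\partial x}, w \Big) \\
&= dx\Big( \frac{\partial}{\partial x} \Big)\, dy(w) - dx(w)\, dy\Big( \frac{\partial}{\partial x} \Big) \\
&= dy(w).
\end{aligned}
```

So $i_{\partial/\partial x}(dx \wedge dy) = dy$: filling the first slot of the two-form leaves a one-form in the remaining slot.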

There is a mundane identity corresponding to the product rule 
for the Lie derivative of an interior product: 


$$\mathcal{L}_v(i_y \omega) = i_{\mathcal{L}_v y}\, \omega + i_y(\mathcal{L}_v \omega). \tag{7.35}$$

And there is a rather nice identity for the Lie derivative in terms
of the interior product and the exterior derivative, called Cartan’s
formula:

$$\mathcal{L}_v \omega = i_v(d\omega) + d(i_v \omega). \tag{7.36}$$

We can verify Cartan’s formula in a simple case with a program:

(define X (literal-vector-field 'X-rect R3-rect))
(define Y (literal-vector-field 'Y-rect R3-rect))
(define Z (literal-vector-field 'Z-rect R3-rect))

(define a (literal-manifold-function 'alpha R3-rect))
(define b (literal-manifold-function 'beta R3-rect))
(define c (literal-manifold-function 'gamma R3-rect))

(define omega
  (+ (* a (wedge dx dy))
     (* b (wedge dy dz))
     (* c (wedge dz dx))))

;; Right-hand side of Cartan's formula (7.36)
(define ((L1 X) omega)
  (+ ((interior-product X) (d omega))
     (d ((interior-product X) omega))))

((- (((Lie-derivative X) omega) Y Z)
    (((L1 X) omega) Y Z))
 ((point R3-rect) (up 'x0 'y0 'z0)))

0 
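
Cartan’s formula (7.36) can also be checked by hand in a very small case. The sketch below assumes the hypothetical one-form $\omega = x\, dy$ and vector field $v = \partial/\partial x$ on the plane, with an arbitrary vector field $w = a\, \partial/\partial x + b\, \partial/\partial y$.

```latex
\begin{aligned}
\text{Cartan: } & d\omega = dx \wedge dy, \quad i_v(d\omega) = dy, \quad i_v \omega = x\, dy(\partial/\partial x) = 0, \\
& i_v(d\omega) + d(i_v \omega) = dy. \\
\text{Direct, by (7.25): } & \omega(w) = x b, \quad v(\omega(w)) = b + x \frac{\partial b}{\partial x}, \\
& [v, w] = \frac{\partial a}{\partial x} \frac{\partial}{\partial x} + \frac{\partial b}{\partial x} \frac{\partial}{\partial y}, \quad
  \omega([v, w]) = x \frac{\partial b}{\partial x}, \\
& (\mathcal{L}_v \omega)(w) = b = dy(w).
\end{aligned}
```

Both routes give $\mathcal{L}_v \omega = dy$.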


Note that $i_v \circ i_u + i_u \circ i_v = 0$. One consequence of this is that
$i_v \circ i_v = 0$.

7.2 Covariant Derivative 

The covariant derivative is another kind of directional derivative
operator. We write the covariant derivative operator with respect
to a vector field v as $\nabla_v$. This is pronounced “covariant derivative
with respect to v” or “nabla v.”

Covariant Derivative of Vector Fields

We may also choose our $F^v_\delta u$ to define what we mean by “parallel”
transport of the vector field u along an integral curve of the vector
field v. This may correspond to our usual understanding of parallel
in situations where we have intuitive insight.

The notion of parallel transport is path dependent. Remember 
our example from the Introduction, page 1: Start at the North 
Pole carrying a stick along a line of longitude to the Equator, 
always pointing it south, parallel to the surface of the Earth. Then
proceed eastward for some distance, still pointing the stick south.
Finally, return to the North Pole along this new line of longitude, 
keeping the stick pointing south all the time. At the pole the stick 
will not point in the same direction as it did at the beginning of the 
trip, and the discrepancy will depend on the amount of eastward 
motion. 5 

So if we try to carry a stick parallel to itself and tangent to the 
sphere, around a closed path, the stick generally does not end up 
pointing in the same direction as it started. The result of carrying 
the stick from one point on the sphere to another depends on the 
path taken. However, the direction of the stick at the endpoint 
of a path does not depend on the rate of transport, just on the 
particular path on which it is carried. Parallel transport over a 
zero-length path is the identity. 

A vector may be resolved as a linear combination of other vec- 
tors. If we parallel-transport each component, and form the same 
linear combination, we get the transported original vector. Thus 
parallel transport on a particular path for a particular distance is 
a linear operation. 

So the transport function $F^v_\delta$ is a linear operator on the
components of its argument, and thus

$$(F^v_\delta u)(f)(m) = \sum_{i,j} \big( A^i_j(\delta)\, (u^j \circ \phi^v_{-\delta})\, e_i(f) \big)(m) \tag{7.37}$$

for some functions $A^i_j$ that depend on the particular path (hence
its tangent vector v) and the initial point. We reach back along the
integral curve to pick up the components of u and then
parallel-transport them forward by the matrix $A^i_j(\delta)$ to form the
components of the parallel-transported vector at the advanced point.

As before, we compute 


$$\nabla_v u(f)(m) = Dg(0), \tag{7.38}$$

where

$$g(\delta) = u(f)(m) - (F^v_\delta u)(f)(m). \tag{7.39}$$


5 In the introduction the stick was always kept east-west rather than pointing
south, but the phenomenon is the same!

Expanding with respect to a basis $\{e_i\}$ we get

$$g(\delta) = \sum_i \Big( u^i e_i(f) - \sum_j A^i_j(\delta)\, (u^j \circ \phi^v_{-\delta})\, e_i(f) \Big)(m). \tag{7.40}$$

By the product rule for derivatives,

$$Dg(\delta) = \sum_{ij} \Big( A^i_j(\delta)\, \big((v(u^j)) \circ \phi^v_{-\delta}\big)\, e_i(f) - DA^i_j(\delta)\, (u^j \circ \phi^v_{-\delta})\, e_i(f) \Big)(m). \tag{7.41}$$

So, since $A^i_j(0)(m)$ is the identity multiplier, and $\phi^v_0$ is the identity
function,

$$Dg(0) = \sum_i \Big( v(u^i)(m)\, e_i(f)(m) - \sum_j DA^i_j(0)\, u^j(m)\, e_i(f)(m) \Big). \tag{7.42}$$

We need $DA^i_j(0)$. Parallel transport depends on the path, but
not on the parameterization of the path. From this we can deduce
that $DA^i_j(0)$ can be written as one-form fields applied to the vector
field v, as follows.

Introduce B to make the dependence of the $A^i_j$ on v explicit:

$$A^i_j(\delta) = B^i_j(v)(\delta). \tag{7.43}$$

Parallel transport depends on the path but not on the rate along
the path. Incrementally, if we scale the vector field v by $\xi$,

$$B(v)(\delta) = B(\xi v)(\delta/\xi). \tag{7.44}$$

Using the chain rule

$$D(B(v))(\delta) = \frac{1}{\xi} D(B(\xi v))(\delta/\xi), \tag{7.45}$$

so, for $\delta = 0$,

$$\xi\, D(B(v))(0) = D(B(\xi v))(0). \tag{7.46}$$


The scale factor $\xi$ can vary from place to place. So $DA^i_j(0)$ is
homogeneous in v over manifold functions. This is stronger than
the homogeneity required by equation (7.7).

The superposition property (equation (7.6)) is true of the
ordinary directional derivative of manifold functions. By analogy we
require it to be true of directional derivatives of vector fields.
These two properties imply that $DA^i_j(0)$ is a one-form field:

$$DA^i_j(0) = -\varpi^i_j(v), \tag{7.47}$$

where the minus sign is a matter of convention. 

As before, we can take a stab at computing the covariant deriva- 
tive of a vector field by supplying an appropriate transport opera- 
tor for F in F->directional-derivative. Again, this is expanded 
to a given order with a given coordinate system. These will be 
unnecessary in the final version. 

(define (covariant-derivative-vector omega coordsys order)
  (let ((Phi (phi coordsys order)))
    (F->directional-derivative
     (F-parallel omega Phi coordsys))))

(define ((((((F-parallel omega phi coordsys) v) delta) u) f) m)
  (let ((basis (coordinate-system->basis coordsys)))
    (let ((etilde (basis->1form-basis basis))
          (e (basis->vector-basis basis)))
      (let ((m0 (((phi v) (- delta)) m)))
        (let ((Aij (+ (identity-like ((omega v) m0))
                      (* delta (- ((omega v) m0)))))
              (ui ((etilde u) m0)))
          (* ((e f) m) (* Aij ui)))))))

So 

$$Dg(0) = \sum_i \Big( v(u^i)(m) + \sum_j \varpi^i_j(v)(m)\, u^j(m) \Big) e_i(f)(m). \tag{7.48}$$

Thus the covariant derivative is

$$\nabla_v u(f) = \sum_i \Big( v(u^i) + \sum_j \varpi^i_j(v)\, u^j \Big) e_i(f). \tag{7.49}$$

The one-form fields $\varpi^i_j$ are called the Cartan one-forms, or the
connection one-forms. They are defined with respect to the basis e.

As a program, the covariant derivative is: 6

(define ((((covariant-derivative-vector Cartan) V) U) f)
  (let ((basis (Cartan->basis Cartan))
        (Cartan-forms (Cartan->forms Cartan)))
    (let ((vector-basis (basis->vector-basis basis))
          (1form-basis (basis->1form-basis basis)))
      (let ((u-components (1form-basis U)))
        (* (vector-basis f)
           (+ (V u-components)
              (* (Cartan-forms V) u-components)))))))

An important property of $\nabla_v u$ is that it is linear over manifold
functions g in the first argument

$$\nabla_{gv} u(f) = g \nabla_v u(f), \tag{7.50}$$

consistent with the fact that the Cartan forms $\varpi^i_j$ share the same
property.

Additionally, we can extend the product rule, for any manifold
function g and any vector field u:

$$\begin{aligned}
\nabla_v(gu)(f) &= \sum_i \Big( v(g u^i) + \sum_j \varpi^i_j(v)\, g u^j \Big) e_i(f) \\
&= \sum_i v(g)\, u^i e_i(f) + g \nabla_v(u)(f) \\
&= (\nabla_v g)\, u(f) + g \nabla_v(u)(f).
\end{aligned} \tag{7.51}$$

An Alternate View 

As we did with the Lie derivative (equations 7.18-7.21), we can 
write the vector field 

$$u(f)(m) = \sum_i u^i(m)\, e_i(f)(m). \tag{7.52}$$

By the extended product rule, equation (7.51), we get:

$$\nabla_v u(f) = \sum_i \big( v(u^i)\, e_i(f) + u^i \nabla_v e_i(f) \big). \tag{7.53}$$


6 This program is incomplete. It must construct a vector field; it must make a 
differential operator; and it does not apply to functions or forms.

Because the covariant derivative of a vector field is a vector field
we can extract the components of $\nabla_v e_i$ using the dual basis:

$$\varpi^i_j(v) = \tilde{e}^i(\nabla_v e_j). \tag{7.54}$$

This gives an alternate expression for the Cartan one-forms. So

$$\nabla_v u(f) = \sum_i \Big( v(u^i) + \sum_j \varpi^i_j(v)\, u^j \Big) e_i(f). \tag{7.55}$$

This analysis is parallel to the analysis of the Lie derivative, except
that here we have the Cartan form fields $\varpi^i_j$ and there we had $\Delta^i_j$,
which are not form fields.

Notice that the Cartan forms appear here (equation 7.53) in 
terms of the covariant derivatives of the basis vectors. By contrast, 
in the first derivation (see equation 7.42) the Cartan forms appear 
as the derivatives of the linear forms that accomplish the parallel 
transport of the coefficients. 

The Cartan forms can be constructed from the dual basis
one-forms:

$$\varpi^i_j(v)(m) = \sum_k \Gamma^i_{jk}(m)\, \tilde{e}^k(v)(m). \tag{7.56}$$

The connection coefficient functions $\Gamma^i_{jk}$ are called the Christoffel
coefficients (traditionally called Christoffel symbols). 7 Making use
of the structures, 8 the Cartan forms are

$$\varpi(v) = \Gamma\, \tilde{e}(v). \tag{7.57}$$

Conversely, the Christoffel coefficients may be obtained from the
Cartan forms

$$\Gamma^i_{jk} = \varpi^i_j(e_k). \tag{7.58}$$
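
As an illustration of equations (7.56) and (7.58), consider polar coordinates $(r, \theta)$ on the plane with the coordinate basis and the standard flat connection. The Christoffel coefficients quoted below are the usual textbook values for that connection; they are an assumption brought in for the example, not something derived in this section.

```latex
\begin{aligned}
&\Gamma^r_{\theta\theta} = -r, \qquad
 \Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = \frac{1}{r}, \qquad
 \text{all others zero}, \\
&\varpi^r_r = 0, \qquad
 \varpi^r_\theta = \Gamma^r_{\theta\theta}\, d\theta = -r\, d\theta, \\
&\varpi^\theta_r = \Gamma^\theta_{r\theta}\, d\theta = \frac{1}{r}\, d\theta, \qquad
 \varpi^\theta_\theta = \Gamma^\theta_{\theta r}\, dr = \frac{1}{r}\, dr.
\end{aligned}
```

Applying a Cartan form to a basis vector recovers the coefficient, as in equation (7.58): for example $\varpi^r_\theta(\partial/\partial\theta) = -r = \Gamma^r_{\theta\theta}$.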


7 This terminology may be restricted to the case in which the basis is a coor- 
dinate basis. 

8 The structure of the Cartan forms $\varpi$ together with this equation forces the
shape of the Christoffel coefficient structure.

Covariant Derivative of One-Form Fields


The covariant derivative of a vector field induces a compatible
covariant derivative for a one-form field. Because the application
of a one-form field to a vector field yields a manifold function, we
can evaluate the covariant derivative of such an application. Let
$\tau$ be a one-form field and w be a vector field. Then

$$\nabla_v(\tau(w)) = \sum_j \big( v(\tau_j)\, w^j + \tau_j\, v(w^j) \big).$$

So if we define the covariant derivative of a one-form field to be

$$\nabla_v(\tau) = \sum_k \Big( v(\tau_k) - \sum_j \tau_j\, \varpi^j_k(v) \Big) \tilde{e}^k, \tag{7.59}$$

then the generalized product rule holds:

$$\nabla_v(\tau(u)) = (\nabla_v \tau)(u) + \tau(\nabla_v u). \tag{7.60}$$


Alternatively, assuming the generalized product rule forces the 
definition of covariant derivative of a one-form field. 
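
The link between the component expansion of $\nabla_v(\tau(w))$ and definition (7.59) can be sketched as follows. The only inputs are the component form of the covariant derivative of a vector field (equation 7.55, applied to w) and $\tilde{e}^k(w) = w^k$; the arrangement of the steps is a reconstruction, not a quotation.

```latex
\begin{aligned}
v(w^j) &= \tilde{e}^j(\nabla_v w) - \sum_k \varpi^j_k(v)\, w^k, \\
\nabla_v(\tau(w)) &= \sum_j \big( v(\tau_j)\, w^j + \tau_j\, v(w^j) \big) \\
&= \sum_k \Big( v(\tau_k) - \sum_j \tau_j\, \varpi^j_k(v) \Big) w^k + \sum_j \tau_j\, \tilde{e}^j(\nabla_v w) \\
&= (\nabla_v \tau)(w) + \tau(\nabla_v w).
\end{aligned}
```

Read from top to bottom this shows that definition (7.59) implies the product rule (7.60); read from bottom to top it shows that requiring the product rule forces the definition.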

As a program this is 

(define ((((covariant-derivative-1form Cartan) V) tau) U)
  (let ((nabla_V ((covariant-derivative-vector Cartan) V)))
    (- (V (tau U)) (tau (nabla_V U)))))

This program extends naturally to higher-rank form fields:

(define ((((covariant-derivative-form Cartan) V) tau) vs)
  (let ((k (get-rank tau))
        (nabla_V ((covariant-derivative-vector Cartan) V)))
    (- (V (apply tau vs))
       (sigma (lambda (i)
                (apply tau
                       (list-with-substituted-coord vs i
                         (nabla_V (list-ref vs i)))))
              0 (- k 1)))))


Change of Basis 

The basis-independence of the covariant derivative implies a re- 
lationship between the Cartan forms in one basis and the equiv- 
alent Cartan forms in another basis. Recall (equation 4.13) that 
the basis vector fields of two bases are always related by a linear 
transformation. Let J be the matrix of coefficient functions and
let e and $e'$ be down tuples of basis vector fields. Then

$$e(f) = e'(f)\, J. \tag{7.61}$$


We want the covariant derivative to be independent of basis. 
This will determine how the connection transforms with a change 
of basis: 


$$\begin{aligned}
\nabla_v u(f) &= \sum_i e_i(f) \Big( v(u^i) + \sum_j \varpi^i_j(v)\, u^j \Big) \\
&= \sum_{ijk} e'_i(f)\, J^i_j \Big( v\big((J^{-1})^j_k (u')^k\big) + \sum_l \varpi^j_k(v)\, (J^{-1})^k_l (u')^l \Big) \\
&= \sum_i e'_i(f) \Big( v((u')^i) + \sum_{jk} J^i_j\, v\big((J^{-1})^j_k\big) (u')^k
   + \sum_{jkl} J^i_j\, \varpi^j_k(v)\, (J^{-1})^k_l (u')^l \Big) \\
&= \sum_i e'_i(f) \Big( v((u')^i) + \sum_j (\varpi')^i_j(v)\, (u')^j \Big).
\end{aligned} \tag{7.62}$$

The last line of equation (7.62) gives the formula for the covariant
derivative we would have written down naturally in the primed
coordinates; comparing with the next-to-last line, we see that

$$\varpi'(v) = J\, v(J^{-1}) + J\, \varpi(v)\, J^{-1}. \tag{7.63}$$

This transformation rule is weird. It is not a linear transformation
of $\varpi$ because the first term is an offset that depends on v. So it
is not required that $\varpi' = 0$ when $\varpi = 0$. Thus $\varpi$ is not a tensor
field. See Appendix C.
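
A one-dimensional sketch shows the offset term of equation (7.63) at work. It assumes the positive real line with two hypothetical coordinates related by $x = s^2$, bases $e = \partial/\partial x$ and $e' = \partial/\partial s$, and a connection whose Cartan form vanishes in the x basis.

```latex
\begin{aligned}
\frac{\partial f}{\partial x} &= \frac{\partial f}{\partial s} \frac{ds}{dx}
  \;\Rightarrow\; J = \frac{1}{2s}, \qquad J^{-1} = 2s, \\
\varpi'(v) &= J\, v(J^{-1}) + J \cdot 0 \cdot J^{-1}
  = \frac{1}{2s}\, v(2s) = \frac{v(s)}{s}
  \;\Rightarrow\; \varpi' = \frac{1}{s}\, ds.
\end{aligned}
```

So a Cartan form that is zero in one basis is nonzero in another. As a cross-check, writing $u = u^x e$ with $u^x = 2s\, u'$ gives $\nabla_v u = v(2s\, u')\, e = \big( v(u') + \tfrac{v(s)}{s} u' \big) e'$, which has exactly the form of equation (7.49) with $\varpi'(v) = v(s)/s$.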

We can write equation (7.61) in terms of components

$$e_i(f) = \sum_j e'_j(f)\, J^j_i. \tag{7.64}$$

Let $K = J^{-1}$, so $\sum_j K^i_j(m)\, J^j_k(m) = \delta^i_k$. Then

$$(\varpi')^i_l(v) = \sum_j J^i_j\, v(K^j_l) + \sum_{jk} J^i_j\, \varpi^j_k(v)\, K^k_l. \tag{7.65}$$

The transformation rule for $\varpi$ is implemented in the following
program:
program: 

(define (Cartan-transform Cartan basis-prime)
  (let ((basis (Cartan->basis Cartan))
        (forms (Cartan->forms Cartan))
        (prime-dual-basis (basis->1form-basis basis-prime))
        (prime-vector-basis (basis->vector-basis basis-prime)))
    (let ((vector-basis (basis->vector-basis basis))
          (1form-basis (basis->1form-basis basis)))
      (let ((J-inv (s:map/r 1form-basis prime-vector-basis))
            (J (s:map/r prime-dual-basis vector-basis)))
        (let ((omega-prime-forms
               (procedure->1form-field
                (lambda (v)
                  (+ (* J (v J-inv))
                     (* J (* (forms v) J-inv)))))))
          (make-Cartan omega-prime-forms basis-prime))))))

The s:map/r procedure constructs a tuple of the same shape as 
its second argument whose elements are the result of applying 
the first argument to the corresponding elements of the second 
argument.

We can illustrate that the covariant derivative is independent of
the coordinate system in a simple case, using rectangular and polar 
coordinates in the plane. 9 We can choose Christoffel coefficients 
for rectangular coordinates that are all zero: 10 

(define R2-rect-Christoffel
  (make-Christoffel
   (let ((zero (lambda (m) 0)))
     (down (down (up zero zero)
                 (up zero zero))
           (down (up zero zero)
                 (up zero zero))))
   R2-rect-basis))

With these Christoffel coefficients, parallel transport preserves the 
components relative to the rectangular basis. This corresponds to 
our usual notion of parallel in the plane. We will see later in 
Chapter 9 that these Christoffel coefficients are a natural choice 
for the plane. From these we obtain the Cartan form: 11 

(define R2-rect-Cartan
  (Christoffel->Cartan R2-rect-Christoffel))

And from equation (7.63) we can get the corresponding Cartan
form for polar coordinates:

(define R2-polar-Cartan
  (Cartan-transform R2-rect-Cartan R2-polar-basis))
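
As a hand check of equation (7.63) in this case, note that the rectangular Cartan form is zero, so only the offset term J v(J^{-1}) survives. The matrices below are worked by hand under the assumption that J is defined by e = e'J with e = (∂/∂x, ∂/∂y) and e' = (∂/∂r, ∂/∂θ); they are an illustration, not output quoted from the program.

```latex
J = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta / r & \cos\theta / r \end{pmatrix},
\qquad
J^{-1} = \begin{pmatrix} \cos\theta & -r \sin\theta \\ \sin\theta & r \cos\theta \end{pmatrix},

\varpi'(\partial/\partial\theta) = J\, \frac{\partial J^{-1}}{\partial\theta}
  = \begin{pmatrix} 0 & -r \\ 1/r & 0 \end{pmatrix},
\qquad
\varpi'(\partial/\partial r) = J\, \frac{\partial J^{-1}}{\partial r}
  = \begin{pmatrix} 0 & 0 \\ 0 & 1/r \end{pmatrix}.
```

The nonzero entries are the familiar polar Christoffel coefficients, $\Gamma^r_{\theta\theta} = -r$ and $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$, even though the unprimed form vanishes. This is the offset that keeps the Cartan form from being a tensor field.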


9 We will need a few definitions:

(define R2-rect-basis (coordinate-system->basis R2-rect))
(define R2-polar-basis (coordinate-system->basis R2-polar))
(define-coordinates (up x y) R2-rect)
(define-coordinates (up r theta) R2-polar)

10 Since the Christoffel coefficients are basis-dependent they are packaged with
the basis.

11 The code for making the Cartan forms is as follows:

(define (Christoffel->Cartan Christoffel)
  (let ((basis (Christoffel->basis Christoffel))
        (Christoffel-symbols (Christoffel->symbols Christoffel)))
    (make-Cartan
     (* Christoffel-symbols (basis->1form-basis basis))
     basis)))
The vector field $\partial/\partial\theta$ generates a rotation in the plane (the
same as circular). The covariant derivative with respect to $\partial/\partial x$
of $\partial/\partial\theta$ applied to an arbitrary manifold function is:

(define circular (- (* x d/dy) (* y d/dx)))
(define f (literal-manifold-function 'f-rect R2-rect))
(define R2-rect-point ((point R2-rect) (up 'x0 'y0)))

(((((covariant-derivative R2-rect-Cartan) d/dx)
    circular)
   f)
  R2-rect-point)
(((partial 1) f-rect) (up x0 y0))

Note that this is the same thing as d/dy applied to the function: 

((d/dy f) R2-rect-point)
(((partial 1) f-rect) (up x0 y0))

In rectangular coordinates, where the Christoffel coefficients are
zero, the covariant derivative $\nabla_u v$ is the vector whose coefficients
are obtained by applying $u$ to the coefficients of $v$. Here, only one
coefficient of $\partial/\partial\theta$ depends on $x$, the coefficient of $\partial/\partial y$, and it
depends linearly on $x$. So $\nabla_{\partial/\partial x}\, \partial/\partial\theta = \partial/\partial y$. (See figure 7.1.)

Note that we get the same answer if we use polar coordinates
to compute the covariant derivative:

(((((covariant-derivative R2-polar-Cartan) d/dx) d/dtheta) f)
 R2-rect-point)
(((partial 1) f-rect) (up x0 y0))

In rectangular coordinates the Christoffel coefficients are all zero; 
in polar coordinates there are nonzero coefficients, but the value 
of the covariant derivative is the same. In polar coordinates the 
basis elements vary with position, and the Christoffel coefficients 
compensate for this. 
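
The compensation can be followed by hand. The sketch below assumes the standard polar coefficients for the flat plane, $\Gamma^r_{\theta\theta} = -r$ and $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$, and the usual expression of $\partial/\partial x$ in the polar basis; it is a worked check, not output of the program.

```latex
\partial/\partial x = \cos\theta\, \partial/\partial r - \frac{\sin\theta}{r}\, \partial/\partial\theta,

\nabla_{\partial/\partial r}\, \partial/\partial\theta = \frac{1}{r}\, \partial/\partial\theta,
\qquad
\nabla_{\partial/\partial\theta}\, \partial/\partial\theta = -r\, \partial/\partial r,

\nabla_{\partial/\partial x}\, \partial/\partial\theta
  = \cos\theta \cdot \frac{1}{r}\, \partial/\partial\theta
    - \frac{\sin\theta}{r} \left( -r\, \partial/\partial r \right)
  = \sin\theta\, \partial/\partial r + \frac{\cos\theta}{r}\, \partial/\partial\theta
  = \partial/\partial y.
```

The components of $\partial/\partial\theta$ in the polar basis are constant, so the whole answer comes from the Christoffel terms, whereas in rectangular coordinates it comes entirely from differentiating the components.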

Of course, this is a pretty special situation. Let’s try something 
more general: 
Figure 7.1 If v and v' are “arrow” representations of vectors in the
circular field and we parallel-transport v in the $\partial/\partial x$ direction, then
the difference between v' and the parallel transport of v is in the $\partial/\partial y$
direction.


(define V (literal-vector-field 'V-rect R2-rect))
(define W (literal-vector-field 'W-rect R2-rect))

(((((- (covariant-derivative R2-rect-Cartan)
       (covariant-derivative R2-polar-Cartan))
    V)
   W)
  f)
 R2-rect-point)
0


7.3 Parallel Transport 

We have defined parallel transport of a vector field along integral
curves of another vector field. But not all paths are integral curves
of a vector field. For example, paths that cross themselves are not
integral curves of any vector field.

Here we extend the idea of parallel transport of a stick to make
sense for arbitrary paths on the manifold. Any path can be written
as a map $\gamma$ from the real-line manifold to the manifold M. We
construct a vector field over the map $u_\gamma$ by parallel-transporting
the stick to all points on the path $\gamma$.

For any path $\gamma$ there are locally directional derivatives of functions
on M defined by tangent vectors to the curve. The vector
over the map $w_\gamma = d\gamma(\partial/\partial t)$ is a directional derivative of functions
on the manifold M along the path $\gamma$.

Our goal is to determine the equations satisfied by the vector
field over the map $u_\gamma$. Consider the parallel-transport $F^{w_\gamma}_\delta u_\gamma$. 12
So a vector field $u_\gamma$ is parallel-transported to itself if and only
if $u_\gamma = F^{w_\gamma}_\delta u_\gamma$. Restricted to a path, the equation analogous to
equation (7.40) is

\[ g(\delta) = \sum_i \Big( u^i(t) - \sum_j A^i_j(\delta)\, u^j(t - \delta) \Big)\, e^\gamma_i(f)(t), \tag{7.66} \]

where the coefficient function $u^i$ is now a function on the real-line
parameter manifold and where we have rewritten the basis as a
basis over the map $\gamma$. 13 Here $g(\delta) = 0$ if $u_\gamma$ is parallel-transported
into itself.

Taking the derivative and setting $\delta = 0$ we find

\[ 0 = \sum_i \Big( Du^i(t) + \sum_j \varpi^i_j(w_\gamma)(t)\, u^j(t) \Big)\, e^\gamma_i(f)(t). \tag{7.67} \]

But this implies that

\[ 0 = Du^i(t) + \sum_j \varpi^i_j(w_\gamma)(t)\, u^j(t), \tag{7.68} \]

an ordinary differential equation in the coefficients of $u_\gamma$.


12 The argument $w_\gamma$ makes sense because our parallel-transport operator never
depended on the vector field tangent to the integral curve existing off of the
curve. Because the connection is a form field (see equation 7.47), it does not
depend on the value of its vector argument anywhere except at the point where
it is being evaluated.

The argument $u_\gamma$ is more difficult. We must modify equation (7.37):

\[ F^{w_\gamma}_\delta u_\gamma(f)(t) = \sum_{i,j} A^i_j(\delta)\, u^j(t - \delta)\, e^\gamma_i(f)(t). \]

13 You may have noticed that t and t appear here. The real-line manifold point 
t has coordinate t. 
We can abstract these equations of parallel transport by inventing
a covariant derivative over a map. We also generalize the time
line to a source manifold N.

\[ \nabla^\gamma_v u_\gamma(f)(n) = \sum_i \Big( v(u^i)(n) + \sum_j \varpi^i_j(d\gamma(v))(n)\, u^j(n) \Big)\, e^\gamma_i(f)(n), \tag{7.69} \]

where the map $\gamma : N \to M$, v is a vector on N, $u_\gamma$ is a vector over
the map $\gamma$, f is a function on M, and n is a point in N. Indeed,
if w is a vector field on M, f is a manifold function on M, and if
$d\gamma(v) = w_\gamma$ then

\[ \nabla^\gamma_v u_\gamma(f)(n) = \nabla_w u(f)(\gamma(n)). \tag{7.70} \]

This is why we are justified in calling $\nabla^\gamma_v$ a covariant derivative.

Respecializing the source manifold to the real line, we can write
the equations governing the parallel transport of $u_\gamma$ as

\[ \nabla^\gamma_{\partial/\partial t} u_\gamma = 0. \tag{7.71} \]

We obtain the set of differential equations (7.68) for the coordinates
of $u_\gamma$, the vector over the map $\gamma$, that is parallel-transported
along the curve $\gamma$:

\[ Du^i(t) + \sum_j \varpi^i_j(d\gamma(\partial/\partial t))(t)\, u^j(t) = 0. \tag{7.72} \]

Expressing the Cartan forms in terms of the Christoffel coefficients
we obtain

\[ Du^i(t) + \sum_{j,k} \Gamma^i_{jk}(\gamma(t))\, D\sigma^k(t)\, u^j(t) = 0 \tag{7.73} \]

where $\sigma = \chi_M \circ \gamma \circ \chi_R^{-1}$ are the coordinates of the path ($\chi_M$ and
$\chi_R$ are the coordinate functions for M and the real line).

On a Sphere 

Let’s figure out what the equations of parallel transport of $u_\gamma$, an
arbitrary vector over the map $\gamma$, along an arbitrary path $\gamma$ on a
sphere are. We start by constructing the necessary manifold.

(define sphere (make-manifold S^2 2 3))

(define S2-spherical
  (coordinate-system-at 'spherical 'north-pole sphere))

(define S2-basis
  (coordinate-system->basis S2-spherical))

We need the path $\gamma$, which we represent as a map from the real
line to M, and w, the parallel-transported vector over the map:

(define gamma
  (compose (point S2-spherical)
           (up (literal-function 'alpha)
               (literal-function 'beta))
           (chart R1-rect)))

where alpha is the colatitude and beta is the longitude.

We also need an arbitrary vector field u_gamma over the map
gamma. To make this we multiply the structure of literal component
functions by the vector basis structure.

(define basis-over-gamma
  (basis->basis-over-map gamma S2-basis))

(define u_gamma
  (* (up (compose (literal-function 'u^0)
                  (chart R1-rect))
         (compose (literal-function 'u^1)
                  (chart R1-rect)))
     (basis->vector-basis basis-over-gamma)))

We specify a connection by giving the Christoffel coefficients. 14

(define S2-Christoffel
  (make-Christoffel
   (let ((zero (lambda (point) 0)))
     (down (down (up zero zero)
                 (up zero (/ 1 (tan theta))))
           (down (up zero (/ 1 (tan theta)))
                 (up (- (* (sin theta) (cos theta))) zero))))
   S2-basis))

(define sphere-Cartan (Christoffel->Cartan S2-Christoffel))


14 We will show later that these Christoffel coefficients are a natural choice for 
the sphere. 
Finally, we compute the residual of the equation (7.71) that governs
parallel transport for this situation: 15

(define-coordinates t R1-rect)

(s:map/r
 (lambda (omega)
   ((omega
     (((covariant-derivative sphere-Cartan gamma)
       d/dt)
      u_gamma))
    ((point R1-rect) 'tau)))
 (basis->1form-basis basis-over-gamma))
(up (+ (* -1
          (sin (alpha tau))
          (cos (alpha tau))
          ((D beta) tau)
          (u^1 tau))
       ((D u^0) tau))
    (/ (+ (* (u^0 tau) (cos (alpha tau)) ((D beta) tau))
          (* ((D alpha) tau) (cos (alpha tau)) (u^1 tau))
          (* ((D u^1) tau) (sin (alpha tau))))
       (sin (alpha tau))))

Thus the equations governing the evolution of the components of
the transported vector are:

\[ \begin{aligned}
Du^0(\tau) &= \sin(\alpha(\tau)) \cos(\alpha(\tau))\, D\beta(\tau)\, u^1(\tau), \\
Du^1(\tau) &= -\frac{\cos(\alpha(\tau))}{\sin(\alpha(\tau))} \big( D\beta(\tau)\, u^0(\tau) + D\alpha(\tau)\, u^1(\tau) \big).
\end{aligned} \tag{7.74} \]

These equations describe the transport on a sphere, but more
generally they look like

\[ Du(\tau) = f(\sigma(\tau), D\sigma(\tau))\, u(\tau), \tag{7.75} \]

where $\sigma$ is the tuple of the coordinates of the path on the manifold
and u is the tuple of the components of the vector. The equation
is linear in u and is driven by the path $\sigma$, as in a variational
equation.


15 If we give covariant-derivative an extra argument, in addition to the 
Cartan form, the covariant derivative treats the extra argument as a map and 
transforms the Cartan form to work over the map. 
We now set this up for numerical integration. Let $s(t) = (t, u(t))$
be a state tuple, combining the time and the coordinates
of $u_\gamma$ at that time. Then we define g:

\[ g(s(t)) = Ds(t) = (1, Du(t)), \tag{7.76} \]

where Du(t) is the tuple of right-hand sides of equation (7.72).
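
To make the numerical setup concrete without scmutils, here is a minimal sketch in plain Scheme that integrates the sphere equations (7.74) with a hand-written fourth-order Runge-Kutta step. All procedure names (transport-rhs, axpy, rk4-step, transport, latitude-circle, pi-value) are hypothetical helpers introduced for this illustration, and the path is a circle of constant colatitude chosen because its exact answer is easy to state.

```scheme
;; Right-hand sides of equation (7.74).  alpha is the colatitude of the
;; path, Dalpha and Dbeta are its coordinate velocities, and the pair u
;; holds the components (u^0 . u^1) of the transported vector.
(define (transport-rhs alpha Dalpha Dbeta u)
  (let ((u0 (car u)) (u1 (cdr u)))
    (cons (* (sin alpha) (cos alpha) Dbeta u1)
          (* -1 (/ (cos alpha) (sin alpha))
             (+ (* Dbeta u0) (* Dalpha u1))))))

;; y + a*x for pairs of numbers.
(define (axpy a x y)
  (cons (+ (car y) (* a (car x)))
        (+ (cdr y) (* a (cdr x)))))

;; One classical fourth-order Runge-Kutta step.  path maps t to the
;; list (alpha Dalpha Dbeta) describing the curve at parameter t.
(define (rk4-step path t h u)
  (define (f t u)
    (let ((p (path t)))
      (transport-rhs (car p) (cadr p) (caddr p) u)))
  (let* ((k1 (f t u))
         (k2 (f (+ t (/ h 2)) (axpy (/ h 2) k1 u)))
         (k3 (f (+ t (/ h 2)) (axpy (/ h 2) k2 u)))
         (k4 (f (+ t h) (axpy h k3 u))))
    (axpy (/ h 6) k1
          (axpy (/ h 3) k2
                (axpy (/ h 3) k3
                      (axpy (/ h 6) k4 u))))))

;; Advance u from t0 to t1 in n equal steps.
(define (transport path t0 t1 n u)
  (let ((h (/ (- t1 t0) n)))
    (let loop ((i 0) (u u))
      (if (= i n)
          u
          (loop (+ i 1) (rk4-step path (+ t0 (* i h)) h u))))))

(define pi-value (* 4 (atan 1)))

;; Circle of constant colatitude alpha0, traversed with beta(t) = t.
(define (latitude-circle alpha0)
  (lambda (t) (list alpha0 0 1)))

;; Carry the southward vector (1 . 0) once around colatitude pi/3.
(define transported
  (transport (latitude-circle (/ pi-value 3)) 0 (* 2 pi-value) 2000
             (cons 1. 0.)))
```

On this path the equations reduce to a rotation of the vector relative to the local frame at rate cos(alpha0), so after one circuit at colatitude pi/3 the vector has turned by pi and transported should be close to (-1 . 0). The state here omits the time slot of the tuple s(t) because the stepper tracks t itself.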

On a Great Circle 

We illustrate parallel transport in a case where we should know 
the answer: we carry a vector along a great circle of a sphere. 
Given a path and Cartan forms for the manifold we can produce 
a state derivative suitable for numerical integration. Such a state 
derivative takes a state and produces the derivative of the state. 

(define (g gamma Cartan)
  (let ((omega
         ((Cartan->forms
           (Cartan->Cartan-over-map Cartan gamma))
          ((differential gamma) d/dt))))
    (define ((the-state-derivative) state)
      (let ((t ((point R1-rect) (ref state 0)))
            (u (ref state 1)))
        (up 1 (* -1 (omega t) u))))
    the-state-derivative))

The path on the sphere will be the target of a map from the real 
line. We choose one that starts at the origin of longitudes on the 
equator and follows the great circle that makes a given tilt angle 
with the equator. 

(define ((transform tilt) coords)
  (let ((colat (ref coords 0))
        (long (ref coords 1)))
    (let ((x (* (sin colat) (cos long)))
          (y (* (sin colat) (sin long)))
          (z (cos colat)))
      (let ((vp ((rotate-x tilt) (up x y z))))
        (let ((colatp (acos (ref vp 2)))
              (longp (atan (ref vp 1) (ref vp 0))))
          (up colatp longp))))))

(define (tilted-path tilt)
  (define (coords t)
    ((transform tilt) (up :pi/2 t)))
  (compose (point S2-spherical)
           coords
           (chart R1-rect)))

A southward pointing vector, with components (up 1 0), is transformed
to an initial vector for the tilted path by multiplying by
the derivative of the tilt transform at the initial point. We then
parallel transport this vector by numerically integrating the differential
equations. In this example we tilt by 1 radian, and we
advance for $\pi/2$ radians. In this case we know the answer: by
advancing by $\pi/2$ we walk around the circle a quarter of the way
and at that point the transported vector points south:

((state-advancer (g (tilted-path 1) sphere-Cartan))
 (up 0 (* ((D (transform 1)) (up :pi/2 0)) (up 1 0)))
 pi/2)
(up 1.5707963267948957
    (up .9999999999997626 7.376378522558262e-13))

However, if we transport by 1 radian rather than $\pi/2$, the numbers
are not so pleasant, and the transported vector no longer points
south:

((state-advancer (g (tilted-path 1) sphere-Cartan))
 (up 0 (* ((D (transform 1)) (up :pi/2 0)) (up 1 0)))
 1)
(up 1. (up .7651502649360408 .9117920272006472))

But the transported vector can be obtained by tilting the original
southward-pointing vector after parallel-transporting along
the equator: 16

(* ((D (transform 1)) (up :pi/2 1)) (up 1 0))
(up .7651502649370375 .9117920272004736)


16 A southward-pointing vector remains southward-pointing when it is parallel- 
transported along the equator. To do this we do not have to integrate the 
differential equations, because we know the answer. 
7.4 Geodesic Motion

In geodesic motion the velocity vector is parallel-transported by
itself. Recall (equation 6.9) that the velocity is the differential of
the vector $\partial/\partial t$ over the map $\gamma$. The equation of geodesic motion
is 17

\[ \nabla^\gamma_{\partial/\partial t}\, d\gamma(\partial/\partial t) = 0. \tag{7.78} \]

In coordinates, this is

\[ D^2\sigma^i(t) + \sum_{jk} \Gamma^i_{jk}(\gamma(t))\, D\sigma^j(t)\, D\sigma^k(t) = 0, \tag{7.79} \]

where $\sigma(t)$ is the coordinate path corresponding to the manifold
path $\gamma$.

For example, let’s consider geodesic motion on the surface of 
a unit sphere. We let gamma be a map from the real line to the 
sphere, with colatitude alpha and longitude beta, as before. The 
geodesic equation is: 

(show-expression
 (((((covariant-derivative sphere-Cartan gamma)
     d/dt)
    ((differential gamma) d/dt))
   (chart S2-spherical))
  ((point R1-rect) 't0)))

\[ \begin{pmatrix}
-\cos(\alpha(t0)) \sin(\alpha(t0)) (D\beta(t0))^2 + D^2\alpha(t0) \\[4pt]
\dfrac{2 D\beta(t0) \cos(\alpha(t0)) D\alpha(t0)}{\sin(\alpha(t0))} + D^2\beta(t0)
\end{pmatrix} \]
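
As a quick sanity check on this residual, whose components are $-\cos\alpha \sin\alpha\, (D\beta)^2 + D^2\alpha$ and $2 D\beta \cos\alpha\, D\alpha / \sin\alpha + D^2\beta$, consider the equator traversed at unit rate, $\alpha(t) = \pi/2$, $\beta(t) = t$. This path is chosen here purely for illustration. Then $D\alpha = D^2\alpha = D^2\beta = 0$ and $\cos\alpha = 0$, so both components vanish:

```latex
-\cos(\pi/2) \sin(\pi/2)\, (1)^2 + 0 = 0,
\qquad
\frac{2 \cdot 1 \cdot \cos(\pi/2) \cdot 0}{\sin(\pi/2)} + 0 = 0.
```

By contrast, a circle of constant colatitude $\alpha_0 \neq \pi/2$ (with $0 < \alpha_0 < \pi$) traversed with $\beta(t) = t$ leaves the first component equal to $-\cos\alpha_0 \sin\alpha_0 \neq 0$, so it is not a geodesic.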




17 The equation of a geodesic path is often said to be

\[ \nabla_v v = 0, \tag{7.77} \]

but this is nonsense. The geodesic equation is a constraint on the path, but
the path does not appear in this equation. Further, the velocity along a path
is not a vector field, so it cannot appear in either argument to the covariant
derivative.

What is true is that a vector field v all of whose integral curves are geodesics
satisfies equation (7.77).
The geodesic equation is the same as the Lagrange equation for
free motion constrained to the surface of the unit sphere. The
Lagrangian for motion on the sphere is the composition of the
free-particle Lagrangian and the state transformation induced by
the coordinate constraint: 18

(define (Lfree s)
  (* 1/2 (square (velocity s))))

(define (sphere->R3 s)
  (let ((q (coordinate s)))
    (let ((theta (ref q 0)) (phi (ref q 1)))
      (up (* (sin theta) (cos phi))
          (* (sin theta) (sin phi))
          (cos theta)))))

(define Lsphere
  (compose Lfree (F->C sphere->R3)))

Then the Lagrange equations are:

(show-expression
 (((Lagrange-equations Lsphere)
   (up (literal-function 'alpha)
       (literal-function 'beta)))
  't))

\[ \begin{bmatrix}
-(D\beta(t))^2 \sin(\alpha(t)) \cos(\alpha(t)) + D^2\alpha(t) \\[4pt]
2 D\alpha(t)\, D\beta(t) \sin(\alpha(t)) \cos(\alpha(t)) + D^2\beta(t)\, (\sin(\alpha(t)))^2
\end{bmatrix} \]

The Lagrange equations are true of the same paths as the geodesic
equations. The second Lagrange equation is the second geodesic
equation multiplied by $(\sin(\alpha(t)))^2$, and the Lagrange equations
are arranged in a down tuple, whereas the geodesic equations are
arranged in an up tuple. 19 The two systems are equivalent unless
$\alpha(t) = 0$, where the coordinate system is singular.
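
The stated relation between the second components can be checked in one line. Suppressing the argument t, multiplying the second geodesic residual by $(\sin\alpha)^2$ gives the second Lagrange residual:

```latex
(\sin\alpha)^2 \left( D^2\beta + \frac{2\, D\beta \cos\alpha\, D\alpha}{\sin\alpha} \right)
  = 2\, D\alpha\, D\beta \sin\alpha \cos\alpha + D^2\beta\, (\sin\alpha)^2 .
```

The first components already agree term by term, so the two systems differ only by this factor, which vanishes exactly where the spherical coordinates are singular.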


18 The method of formulating a system with constraints by composing a free 
system with the state-space coordinate transformation that represents the 
constraints can be found in [19], Section 1.6.3. The procedure F->C takes 
a coordinate transformation and produces a corresponding transformation of 
Lagrangian state. 

19 The geodesic equations and the Lagrange equations are related by a con- 
traction with the metric. 
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Exercise 7.1: Hamiltonian Evolution 

We have just seen that the Lagrange equations for the motion of a free 
particle constrained to the surface of a sphere determine the geodesics 
on the sphere. We can investigate this phenomenon in the Hamiltonian 
formulation. The Hamiltonian is obtained from the Lagrangian by a 
Legendre transformation: 

(define Hsphere 

(Lagrangian->Hamiltonian Lsphere)) 

We can get the coordinate representation of the Hamiltonian vector field 
as follows: 

((phase-space-derivative Hsphere)
 (up 't (up 'theta 'phi) (down 'p_theta 'p_phi)))
(up 1
    (up p_theta
        (/ p_phi (expt (sin theta) 2)))
    (down (/ (* (expt p_phi 2) (cos theta))
             (expt (sin theta) 3))
          0))

The state space for Hamiltonian evolution has five dimensions: time, two
dimensions of position on the sphere, and two dimensions of momentum:

(define state-space
  (make-manifold R^n 5))

(define states
  (coordinate-system-at 'rectangular 'origin state-space))

(define-coordinates
  (up t (up theta phi) (down p_theta p_phi))
  states)

So now we have coordinate functions and the coordinate-basis vector 
fields and coordinate-basis one-form fields. 

a. Define the Hamiltonian vector field as a linear combination of these 
fields. 

b. Obtain the first few terms of the Taylor series for the evolution of
the coordinates $(\theta, \phi)$ by exponentiating the Lie derivative of the
Hamiltonian vector field.

Exercise 7.2: Lie Derivative and Covariant Derivative 

How are the Lie derivative and the covariant derivative related? 

a. Prove that for every vector field there exists a connection such that 
the covariant derivative for that connection and the given vector field is 
equivalent to the Lie derivative with respect to that vector field. 

b. Show that there is no connection that for every vector field makes 
the Lie derivative the same as the covariant derivative with the chosen 
connection. 





8 

Curvature 


If the intrinsic curvature of a manifold is not zero, a vector parallel- 
transported around a small loop will end up different from the 
vector that started. We saw the consequence of this before, on 
page 1 and on page 93. The Riemann tensor encapsulates this 
idea. 

The Riemann curvature operator is

\[ \mathcal{R}(w, v) = [\nabla_w, \nabla_v] - \nabla_{[w,v]}. \tag{8.1} \]

The traditional Riemann tensor is 1

\[ R(\omega, u, w, v) = \omega((\mathcal{R}(w, v))(u)), \tag{8.2} \]

where $\omega$ is a one-form field that measures the incremental change
in the vector field u caused by parallel-transporting it around the
loop defined by the vector fields w and v. R allows us to compute
the intrinsic curvature of a manifold at a point.

The Riemann curvature is computed by 

(define ((Riemann-curvature nabla) w v)
  (- (commutator (nabla w) (nabla v))
     (nabla (commutator w v))))

The Riemann-curvature procedure is parameterized by the relevant
covariant-derivative operator nabla, which implements $\nabla$.
The nabla is itself dependent on the connection, which provides
the details of the local geometry. The same Riemann-curvature
procedure works for ordinary covariant derivatives and for covariant
derivatives over a map. Given two vector fields, the result
of ((Riemann-curvature nabla) w v) is a procedure that takes a
vector field and produces a vector field so we can implement the
Riemann tensor as


1 [11], [4], and [14] use our definition. [20] uses a different convention for the 
order of arguments and a different sign. See Appendix C for a definition of 
tensors. 
(define ((Riemann nabla) omega u w v)
  (omega (((Riemann-curvature nabla) w v) u)))

So, for example, 2

(((Riemann (covariant-derivative sphere-Cartan))
  dphi d/dtheta d/dphi d/dtheta)
 ((point S2-spherical) (up 'theta0 'phi0)))
1

Here we have computed the $\phi$ component of the result of carrying
a $\partial/\partial\theta$ basis vector around the parallelogram defined by $\partial/\partial\phi$ and
$\partial/\partial\theta$. The result shows a net rotation in the $\phi$ direction.

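Most of the sixteen coefficients of the Riemann tensor for the
sphere are zero. The following are the nonzero coefficients:

\[ \begin{aligned}
R(d\theta, \partial/\partial\phi, \partial/\partial\theta, \partial/\partial\phi)(\chi^{-1}(q^\theta, q^\phi)) &= (\sin(q^\theta))^2, \\
R(d\theta, \partial/\partial\phi, \partial/\partial\phi, \partial/\partial\theta)(\chi^{-1}(q^\theta, q^\phi)) &= -(\sin(q^\theta))^2, \\
R(d\phi, \partial/\partial\theta, \partial/\partial\theta, \partial/\partial\phi)(\chi^{-1}(q^\theta, q^\phi)) &= -1, \\
R(d\phi, \partial/\partial\theta, \partial/\partial\phi, \partial/\partial\theta)(\chi^{-1}(q^\theta, q^\phi)) &= 1.
\end{aligned} \tag{8.3} \]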

8.1 Explicit Transport 

We will show that the result of the Riemann calculation of the
change in a vector, as we traverse a loop, is what we get by explicitly
calculating the transport. The coordinates of the vector
to be transported are governed by the differential equations (see
equation 7.72)

\[ Du^i(t) = -\sum_j \varpi^i_j(v)(\chi^{-1}(\sigma(t)))\, u^j(t) \tag{8.4} \]


2 The connection specified by sphere-Cartan is defined on page 107. 
and the coordinates as a function of time, $\sigma = \chi \circ \gamma \circ \chi_R^{-1}$, of the
path $\gamma$, are governed by the differential equations 3

\[ D\sigma(t) = v(\chi)(\chi^{-1}(\sigma(t))). \tag{8.5} \]

We have to integrate these equations (8.4, 8.5) together to transport
the vector over the map $u_\gamma$ a finite distance along the vector
field v.

Let $s(t) = (\sigma(t), u(t))$ be a state tuple, combining $\sigma$ the coordinates
of $\gamma$, and u the coordinates of $u_\gamma$. Then

\[ Ds(t) = (D\sigma(t), Du(t)) = g(s(t)), \tag{8.6} \]

where g is the tuple of right-hand sides of equations (8.4, 8.5).

The differential equations describing the evolution of a function
h of state s along the state path are

\[ D(h \circ s) = (Dh \circ s)(g \circ s) = L_g h \circ s, \tag{8.7} \]

defining the operator $L_g$.

Exponentiation gives a finite evolution. 4

\[ h(s(t + \epsilon)) = (e^{\epsilon L_g} h)(s(t)). \tag{8.8} \]

The finite parallel transport of the vector with components u is

\[ u(t + \epsilon) = (e^{\epsilon L_g} U)(s(t)), \tag{8.9} \]

where the selector $U(\sigma, u) = u$, and the initial state is $s(t) = (\sigma(t), u(t))$.

Consider parallel-transporting a vector u around a parallelogram
defined by two coordinate-basis vector fields w and v. The
vector u is really a vector over a map, where the map is the parametric
curve describing our parallelogram. This map is implicitly
defined in terms of the vector fields w and v. Let $g_w$ and $g_v$ be the
right-hand sides of the differential equations for parallel transport


3 The map $\gamma$ takes points on the real line to points on the target manifold.
The chart $\chi$ gives coordinates of points on the target manifold while $\chi_R$ gives
a time coordinate on the real line.

4 The series may not converge for large increments in the independent variable.
In this case it is appropriate to numerically integrate the differential equations
directly.
along w and v respectively. Then evolution along w for interval
$\epsilon$, then along v for interval $\epsilon$, then reversing w, and reversing v,
brings $\sigma$ back to where it started to second order in $\epsilon$.

The state $s = (\sigma, u)$ after transporting $s_0$ around the loop is 5

\[ \begin{aligned}
& (e^{-\epsilon L_{g_v}} I) \circ (e^{-\epsilon L_{g_w}} I) \circ (e^{\epsilon L_{g_v}} I) \circ (e^{\epsilon L_{g_w}} I)(s_0) \\
&\quad = (e^{\epsilon L_{g_w}} e^{\epsilon L_{g_v}} e^{-\epsilon L_{g_w}} e^{-\epsilon L_{g_v}} I)(s_0) \\
&\quad = (e^{\epsilon^2 [L_{g_w}, L_{g_v}] + \cdots} I)(s_0).
\end{aligned} \tag{8.10} \]

So the lowest-order change in the transported vector is

\[ \epsilon^2\, U(([L_{g_w}, L_{g_v}] I)(s_0)), \tag{8.11} \]

where $U(\sigma, u) = u$.
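
The appearance of the commutator at order $\epsilon^2$ can be seen by expanding the exponentials directly. In the sketch below A and B are shorthand, introduced only for this illustration, for $L_{g_w}$ and $L_{g_v}$; no assumption is made that they commute.

```latex
e^{\epsilon A} e^{\epsilon B}
  = 1 + \epsilon (A + B) + \epsilon^2 \left( \tfrac{1}{2} A^2 + A B + \tfrac{1}{2} B^2 \right) + O(\epsilon^3),

e^{-\epsilon A} e^{-\epsilon B}
  = 1 - \epsilon (A + B) + \epsilon^2 \left( \tfrac{1}{2} A^2 + A B + \tfrac{1}{2} B^2 \right) + O(\epsilon^3),

e^{\epsilon A} e^{\epsilon B} e^{-\epsilon A} e^{-\epsilon B}
  = 1 + \epsilon^2 \left( A^2 + 2 A B + B^2 - (A + B)^2 \right) + O(\epsilon^3)
  = 1 + \epsilon^2 [A, B] + O(\epsilon^3).
```

The first-order terms cancel, and the second-order remainder is $AB - BA$, which is why the loop changes the state by $\epsilon^2$ times the commutator at lowest order.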

However, if w and v do not commute, the indicated loop does
not bring $\sigma$ back to the starting point, to second order in $\epsilon$. We
must account for the commutator. (See figure 4.2.) In the general
case the lowest order change in the transported vector is

\[ \epsilon^2\, U((([L_{g_w}, L_{g_v}] - L_{g_{[w,v]}}) I)(s_0)). \tag{8.12} \]

This is what the Riemann tensor computation gives, scaled by $\epsilon^2$.

Verification in Two Dimensions 

We can verify this in two dimensions. We need to make the struc- 
ture representing a state: 

(define (make-state sigma u) (vector sigma u))
(define (Sigma state) (ref state 0))
(define (U-select state) (ref state 1))


5 The parallel-transport operators are evolution operators, and therefore
descend into composition:

\[ e^A (F \circ G) = F \circ (e^A G), \]

for any state function G and any compatible F. As a consequence, we have
the following identity:

\[ e^A e^B I = e^A ((e^B I) \circ I) = (e^B I) \circ (e^A I), \]

where I is the identity function on states.
And now we get to the meat of the matter: First we find the rate
of change of the components of the vector u as we carry it along
the vector field v. 6

(define ((Du v) state)
  (let ((CF (Cartan->forms general-Cartan-2)))
    (* -1
       ((CF v) (Chi-inverse (Sigma state)))
       (U-select state))))

We also need to determine the rate of change of the coordinates
of the integral curve of v.

(define ((Dsigma v) state)
  ((v Chi) (Chi-inverse (Sigma state))))

Putting these together to make the derivative of the state vector

(define ((g v) state)
  (make-state ((Dsigma v) state) ((Du v) state)))

gives us just what we need to construct the differential operator
for evolution of the combined state:

(define (L v)
  (define ((l h) state)
    (* ((D h) state) ((g v) state)))
  (make-operator l))

So now we can demonstrate that the lowest-order change re- 
sulting from explicit parallel transport of a vector around an in- 
finitesimal loop is what is computed by the Riemann curvature. 


6 The setup for this experiment is a bit complicated. We need to make a
manifold with a general connection.

(define Chi-inverse (point R2-rect))
(define Chi (chart R2-rect))

We now make the Cartan forms from the most general 2-dimensional
Christoffel coefficient structure:

(define general-Cartan-2
  (Christoffel->Cartan
   (literal-Christoffel-2 'Gamma R2-rect)))






(let ((U (literal-vector-field 'U-rect R2-rect))
      (W (literal-vector-field 'W-rect R2-rect))
      (V (literal-vector-field 'V-rect R2-rect))
      (sigma (up 'sigma0 'sigma1)))
  (let ((nabla (covariant-derivative general-Cartan-2))
        (m (Chi-inverse sigma)))
    (let ((s (make-state sigma ((U Chi) m))))
      (- (((- (commutator (L V) (L W))
              (L (commutator V W)))
           U-select)
          s)
         (((((Riemann-curvature nabla) W V) U) Chi) m)))))
(up 0 0)

Geometrically 

The explicit transport above was done with differential equations 
operating on a state consisting of coordinates and components of 
the vector being transported. We can simplify this so that it is 
entirely built on manifold objects, eliminating the state. After a 
long algebraic story we find that 

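\[ \begin{aligned}
((\mathcal{R}(w, v))(u))(f) = e(f) \big\{ & (w(\varpi(v)) - v(\varpi(w)) - \varpi([w, v]))\, \tilde{e}(u) \\
& + \varpi(w)\varpi(v)\, \tilde{e}(u) - \varpi(v)\varpi(w)\, \tilde{e}(u) \big\}
\end{aligned} \tag{8.13} \]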


or as a program: 

(define ((((curvature-from-transport Cartan) w v) u) f)
  (let* ((CF (Cartan->forms Cartan))
         (basis (Cartan->basis Cartan))
         (fi (basis->1form-basis basis))
         (ei (basis->vector-basis basis)))
    (* (ei f)
       (+ (* (- (- (w (CF v)) (v (CF w)))
                (CF (commutator w v)))
             (fi u))
          (- (* (CF w) (* (CF v) (fi u)))
             (* (CF v) (* (CF w) (fi u))))))))


This computes the same operator as the traditional Riemann curvature
operator:

(define (test coordsys Cartan)
  (let ((m (typical-point coordsys))
        (u (literal-vector-field 'u-coord coordsys))
        (w (literal-vector-field 'w-coord coordsys))
        (v (literal-vector-field 'v-coord coordsys))
        (f (literal-manifold-function 'f-coord coordsys)))
    (let ((nabla (covariant-derivative Cartan)))
      (- (((((curvature-from-transport Cartan) w v) u) f) m)
         (((((Riemann-curvature nabla) w v) u) f) m)))))

(test R2-rect general-Cartan-2)
0

(test R2-polar general-Cartan-2)
0

Terms of the Riemann Curvature 

Since the Riemann curvature is defined as in equation (8.1),

\[ \mathcal{R}(w, v) = [\nabla_w, \nabla_v] - \nabla_{[w,v]}, \tag{8.14} \]

it is natural 7 to identify these terms with the corresponding terms
in

\[ (([L_{g_w}, L_{g_v}] - L_{g_{[w,v]}})\, U)(s_0). \tag{8.15} \]

Unfortunately, this does not work, as demonstrated below:

(let ((U (literal-vector-field 'U-rect R2-rect))
      (V (literal-vector-field 'V-rect R2-rect))
      (W (literal-vector-field 'W-rect R2-rect))
      (nabla (covariant-derivative general-Cartan-2))
      (sigma (up 'sigma0 'sigma1)))
  (let ((m (Chi-inverse sigma)))
    (let ((s (make-state sigma ((U Chi) m))))
      (- (((commutator (L W) (L V)) U-select) s)
         ((((commutator (nabla W) (nabla V)) U) Chi)
          m)))))
a nonzero mess


7 People often say “Geodesic evolution is exponentiation of the covariant
derivative.” But this is wrong. The evolution is by exponentiation of $L_g$.






The obvious identification does not work, but neither does the 
other one! 

(let ((U (literal-vector-field 'U-rect R2-rect))
      (V (literal-vector-field 'V-rect R2-rect))
      (W (literal-vector-field 'W-rect R2-rect))
      (nabla (covariant-derivative general-Cartan-2))
      (sigma (up 'sigma0 'sigma1)))
  (let ((m (Chi-inverse sigma)))
    (let ((s (make-state sigma ((U Chi) m))))
      (- (((commutator (L W) (L V)) U-select) s)
         ((((nabla (commutator W V)) U) Chi)
          m)))))
a nonzero mess


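Let’s compute the two parts of the Riemann curvature operator
and see how this works out. First, recall

\[ \begin{aligned}
\nabla_v u(f) &= \sum_i e_i(f) \Big( v(\tilde{e}^i(u)) + \sum_j \varpi^i_j(v)\, \tilde{e}^j(u) \Big) && (8.16) \\
&= e(f) \big( v(\tilde{e}(u)) + \varpi(v)\, \tilde{e}(u) \big), && (8.17)
\end{aligned} \]

where the second form uses tuple arithmetic. Now let’s consider
the first part of the Riemann curvature operator:

\[ \begin{aligned}
[\nabla_w, \nabla_v]\, u
&= \nabla_w \nabla_v u - \nabla_v \nabla_w u \\
&= e \big\{ w(v(\tilde{e}(u)) + \varpi(v)\tilde{e}(u)) + \varpi(w)(v(\tilde{e}(u)) + \varpi(v)\tilde{e}(u)) \big\} \\
&\quad - e \big\{ v(w(\tilde{e}(u)) + \varpi(w)\tilde{e}(u)) + \varpi(v)(w(\tilde{e}(u)) + \varpi(w)\tilde{e}(u)) \big\} \\
&= e \big\{ [w, v]\tilde{e}(u) + w(\varpi(v))\tilde{e}(u) - v(\varpi(w))\tilde{e}(u) \\
&\qquad + \varpi(w)\varpi(v)\tilde{e}(u) - \varpi(v)\varpi(w)\tilde{e}(u) \big\}.
\end{aligned} \tag{8.18} \]

The second term of the Riemann curvature operator is

\[ \nabla_{[w,v]} u = e \big\{ [w, v]\tilde{e}(u) + \varpi([w, v])\tilde{e}(u) \big\}. \tag{8.19} \]

The difference of these is the Riemann curvature operator. Notice
that the first term in each cancels, and the rest gives
equation (8.13).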
Ricci Curvature

One measure of the curvature is the Ricci tensor, which is computed
from the Riemann tensor by

\[ R(u, v) = \sum_i R(\tilde{e}^i, u, e_i, v). \tag{8.20} \]

Expressed as a program:

(define ((Ricci nabla basis) u v)
  (contract (lambda (ei wi) ((Riemann nabla) wi u ei v))
            basis))

Einstein’s field equation (9.27) for gravity, which we will encounter
later, is expressed in terms of the Ricci tensor.

Exercise 8.1: Ricci of a Sphere 

Compute the components of the Ricci tensor of the surface of a sphere. 

Exercise 8.2: Pseudosphere 

A pseudosphere is a surface in 3-dimensional space. It is a surface of
revolution of a tractrix about its asymptote (along the z-axis). We can
make coordinates for the surface $(t, \theta)$ where t is the coordinate along the
asymptote and $\theta$ is the angle of revolution. We embed the pseudosphere
in rectangular 3-dimensional space with

(define (pseudosphere q) 

(let ( (t (ref q 0)) (theta (ref q 1))) 

(up (* (sech t) (cos theta)) 

(* (sech t) (sin theta)) 

(- t (tanh t))))) 

The structure of Christoffel coefficients for the pseudosphere is 

(down 

(down (up (/ (+ (* 2 (expt (cosh t) 2) (expt (sinh t) 2)) 

(* -2 (expt (sinh t) 4)) (expt (cosh t) 2) 

(* -2 (expt (sinh t) 2))) 

(+ (* (cosh t) (expt (sinh t) 3)) 

(* (cosh t) (sinh t)))) 

0) 

(up 0 

(/ (* -1 (sinh t)) (cosh t)))) 

(down (up 0 

(/ (* -1 (sinh t)) (cosh t))) 

(up (/ (cosh t) (+ (expt (sinh t) 3) (sinh t))) 

0 ))) 

Note that this is independent of $\theta$.

Compute the components of the Ricci tensor. 
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8.2 Torsion 

There are many connections that describe the local properties 
of any particular manifold. A connection has a property called 
torsion, which is computed as follows: 

T(u, v) = V u v — V v u - [u, v], (8.21) 

The torsion takes two vector fields and produces a vector field. 
The torsion depends on the covariant derivative, which is con- 
structed from the connection. 

We account for this dependency by parameterizing the program 
by nabla. 

(define ((torsion-vector nabla) u v)
  (- (- ((nabla u) v) ((nabla v) u))
     (commutator u v)))

(define ((torsion nabla) omega u v)
  (omega ((torsion-vector nabla) u v)))

The torsion for the connection for the 2-sphere specified by the
Christoffel coefficients S2-Christoffel above is zero. We demonstrate
this by applying the torsion to the basis vector fields:

(for-each
 (lambda (x)
   (for-each
    (lambda (y)
      (print-expression
       ((((torsion-vector (covariant-derivative sphere-Cartan))
          x y)
         (literal-manifold-function 'f S2-spherical))
        ((point S2-spherical) (up 'theta0 'phi0)))))
    (list d/dtheta d/dphi)))
 (list d/dtheta d/dphi))

0

0

0

0

Torsion Doesn’t Affect Geodesics 

There are multiple connections that give the same geodesic curves. 
Among these connections there is always one with zero torsion. 
Thus, if you care about only geodesics, it is appropriate to use a 
torsion-free connection. 
Consider a basis $e$ and its dual $\tilde{e}$. The components of the
torsion are

$$\tilde{e}^k(T(e_i, e_j)) = \Gamma^k_{ij} - \Gamma^k_{ji} + d^k_{ij}, \tag{8.22}$$

where $d^k_{ij}$ are the structure constants of the basis. See
equations (4.37, 4.38). For a commuting basis the structure constants
are zero, and the components of the torsion are the antisymmetric
part of $\Gamma$ with respect to the lower indices.

Recall the geodesic equation (7.79):

$$D^2\sigma^i(t) + \sum_{jk} \Gamma^i_{jk}(\gamma(t))\, D\sigma^j(t)\, D\sigma^k(t) = 0. \tag{8.23}$$

Observe that the lower indices of $\Gamma$ are contracted with two copies
of the velocity. Because the use of $\Gamma$ is symmetrical here, any
asymmetry of $\Gamma$ in its lower indices is irrelevant to the geodesics.
Thus one can study the geodesics of any connection by first sym- 
metrizing the connection, eliminating torsion. The resulting equa- 
tions will be simpler. 
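
The symmetry argument can be checked numerically. The following Python sketch is not part of the book's Scheme system; the nested list `gamma` and the helper names are hypothetical, chosen only for illustration. It fills the coefficients with arbitrary numbers, symmetrizes them in the lower indices, and compares the quadratic velocity term of the geodesic equation before and after.

```python
import random

def quadratic_term(gamma, v):
    """For each i, sum over j, k of gamma[i][j][k] * v[j] * v[k]."""
    n = len(v)
    return [sum(gamma[i][j][k] * v[j] * v[k]
                for j in range(n) for k in range(n))
            for i in range(n)]

def symmetrize(gamma):
    """Keep only the part of gamma that is symmetric in the lower indices."""
    n = len(gamma)
    return [[[0.5 * (gamma[i][j][k] + gamma[i][k][j])
              for k in range(n)] for j in range(n)] for i in range(n)]

random.seed(0)
n = 3
# Arbitrary coefficients, generally not symmetric in the lower indices.
gamma = [[[random.uniform(-1, 1) for _ in range(n)]
          for _ in range(n)] for _ in range(n)]
v = [random.uniform(-1, 1) for _ in range(n)]

full = quadratic_term(gamma, v)
symmetric_only = quadratic_term(symmetrize(gamma), v)
max_diff = max(abs(a - b) for a, b in zip(full, symmetric_only))
```

Here `max_diff` is zero up to rounding: the antisymmetric part of the coefficients never reaches the geodesic equation.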


8.3 Geodesic Deviation 

Geodesics may converge and intersect (as in the lines of longitude 
on a sphere) or they may diverge (for example, on a saddle). To 
capture this notion requires some measure of the convergence or 
divergence, but this requires metrics (see Chapter 9). But even 
in the absence of a metric we can define a quantity, the geodesic 
deviation, that can be interpreted in terms of relative acceleration
of neighboring geodesics from a reference geodesic. 

Let there be a one-parameter family of geodesics, with param- 
eter s, and let T be the vector field of tangent vectors to those 
geodesics: 

$$\nabla_T T = 0. \tag{8.24}$$

We can parameterize travel along the geodesics with parameter $t$:
a geodesic curve $\gamma_s(t) = \phi^T_t(m_s)$ where

$$f \circ \phi^T_t(m_s) = (e^{tT} f)(m_s). \tag{8.25}$$

Let $U = \partial/\partial s$ be the vector field corresponding to the
displacement of neighboring geodesics. Locally, $(t, s)$ is a coordinate
system on the 2-dimensional submanifold formed by the family of 
geodesics. The vector fields T and U are a coordinate basis for 
this coordinate system, so [T, U] = 0. 

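The geodesic deviation vector field is defined as:

$$\nabla_T(\nabla_T U). \tag{8.26}$$

If the connection has zero torsion, the geodesic deviation can
be related to the Riemann curvature:

$$\nabla_T(\nabla_T U) = -\mathcal{R}(U, T)(T), \tag{8.27}$$

as follows, using equation (8.21),

$$\nabla_T(\nabla_T U) = \nabla_T(\nabla_U T), \tag{8.28}$$

because the torsion is zero and $[T, U] = 0$. Continuing

$$\begin{aligned}
\nabla_T(\nabla_T U) &= \nabla_T(\nabla_U T) \\
&= \nabla_T(\nabla_U T) + \nabla_U(\nabla_T T) - \nabla_U(\nabla_T T) \\
&= \nabla_U(\nabla_T T) - \mathcal{R}(U, T)(T) \\
&= -\mathcal{R}(U, T)(T).
\end{aligned} \tag{8.29}$$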

In the last line the first term was dropped because T satisfies the 
geodesic equation (8.24). 

The geodesic deviation is defined without using a metric, but 
it helps to have a metric (see Chapter 9) to interpret the geodesic 
deviation. Consider two neighboring geodesics, with parameters 
$s$ and $s + \Delta s$. Given a metric we can assume that $t$ is
proportional to path length along each geodesic, and we can define a
distance $\delta(s, t, \Delta s)$ between the geodesics at the same value
of the parameter $t$. So the velocity of separation of the two geodesics is

$$(\nabla_T U)\,\Delta s = \partial_1 \delta(s, t, \Delta s)\,\hat{s} \tag{8.30}$$

where $\hat{s}$ is a unit vector in the direction of increasing $s$. So
$\nabla_T U$ is the factor of increase of velocity with increase of
separation. Similarly, the geodesic deviation can be interpreted as the
factor of increase of acceleration with increase of separation:

$$\nabla_T(\nabla_T U)\,\Delta s = \partial_1 \partial_1 \delta(s, t, \Delta s)\,\hat{s}. \tag{8.31}$$
Longitude Lines on a Sphere

Consider longitude lines on the unit sphere.⁸ Let theta be colatitude
and phi be longitude. These are the parameters t and s, respectively.
Then let T be the vector field d/dtheta that is tangent to the
longitude lines.

We can verify that every longitude line is a geodesic: 

((omega (((covariant-derivative Cartan) T) T)) m) 

0 

where omega is an arbitrary one-form field. 

Now let U be d/dphi, then U commutes with T: 

(((commutator U T) f) m) 

0 

The torsion for the usual connection for the sphere is zero: 

(let ((X (literal-vector-field 'X-sphere S2-spherical))
      (Y (literal-vector-field 'Y-sphere S2-spherical)))
  ((((torsion-vector nabla) X Y) f) m))

0

So we can compute the geodesic deviation using Riemann

((+ (omega ((nabla T) ((nabla T) U)))
    ((Riemann nabla) omega T U T))
 m)

0

confirming equation (8.29). 

Lines of longitude are geodesics. How do the lines of longi- 
tude behave? As we proceed from the North Pole, the lines of 
constant longitude diverge. At the Equator they are parallel and 
they converge towards the South Pole. 

Let’s compute $\nabla_T U$ and $\nabla_T(\nabla_T U)$. We know that the
distance is purely in the $\phi$ direction, so

((dphi ((nabla T) U)) m)

(/ (cos theta0) (sin theta0))

((dphi ((nabla T) ((nabla T) U))) m)

-1


8 The setup for this example is:

(define-coordinates (up theta phi) S2-spherical)
(define T d/dtheta)
(define U d/dphi)
(define m ((point S2-spherical) (up 'theta0 'phi0)))
(define Cartan (Christoffel->Cartan S2-Christoffel))
(define nabla (covariant-derivative Cartan))


Let’s interpret these results. On a sphere of radius $R$ the distance
at colatitude $\theta$ between two geodesics separated by $\Delta\phi$ is
$d(\phi, \theta, \Delta\phi) = R \sin(\theta)\,\Delta\phi$. Assuming that
$\theta$ is uniformly increasing with time, the magnitude of the velocity
is just the $\theta$-derivative of this distance:

(define ((delta R) phi theta Delta-phi)
  (* R (sin theta) Delta-phi))

(((partial 1) (delta 'R)) 'phi0 'theta0 'Delta-phi)

(* Delta-phi R (cos theta0))

The direction of the velocity is the unit vector in the $\phi$ direction:

(define phi-hat
  (* (/ 1 (sin theta)) d/dphi))

This comes from the fact that the separation of lines of longitude 
is proportional to the sine of the colatitude. So the velocity vector 
field is the product. 

We can measure the $\phi$ component with $d\phi$:

((dphi (* (((partial 1) (delta 'R))
           'phi0 'theta0 'Delta-phi)
          phi-hat))
 m)

(/ (* Delta-phi R (cos theta0)) (sin theta0))


This agrees with $\nabla_T U\,\Delta\phi$ for the unit sphere. Indeed, the lines
of longitude diverge until they reach the Equator and then they 
converge. 

Similarly, the magnitude of the acceleration is 

(((partial 1) ((partial 1) (delta 'R)))
 'phi0 'theta0 'Delta-phi)

(* -1 Delta-phi R (sin theta0))

and the acceleration vector is the product of this result with $\hat{\phi}$.
Measuring this with $d\phi$ we get:

((dphi (* (((partial 1) ((partial 1) (delta 'R)))
           'phi0 'theta0 'Delta-phi)
          phi-hat))
 m)

(* -1 Delta-phi R)

And this agrees with the calculation of $\nabla_T \nabla_T U\,\Delta\phi$ for
the unit sphere. We see that the separation of the lines of longitude is
uniformly decelerated as they progress from pole to pole.
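
The two comparisons above can be repeated with plain finite differences. This Python sketch is an illustration outside the book's Scheme system; the function names, the sample point, and the step sizes are assumptions. It differentiates the separation R sin(theta) Delta-phi once and twice with respect to theta and divides by sin(theta), which is what measuring a multiple of phi-hat with dphi does.

```python
import math

def delta(R, phi, theta, dphi):
    """Distance between two longitude lines separated by dphi at colatitude theta."""
    return R * math.sin(theta) * dphi

def d1(f, x, h=1e-5):
    """Central-difference first derivative."""
    return (f(x + h) - f(x - h)) / (2 * h)

def d2(f, x, h=1e-4):
    """Central-difference second derivative."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

R, phi0, theta0, dphi = 1.0, 0.3, 0.7, 0.01
separation = lambda theta: delta(R, phi0, theta, dphi)

# dphi applied to phi-hat = (1 / sin theta) d/dphi gives 1 / sin theta.
velocity_component = d1(separation, theta0) / math.sin(theta0)
acceleration_component = d2(separation, theta0) / math.sin(theta0)

expected_velocity = dphi * math.cos(theta0) / math.sin(theta0)
expected_acceleration = -dphi
```

On the unit sphere the first quantity matches cot(theta0) times Delta-phi and the second matches minus Delta-phi, in line with the symbolic results for the covariant derivatives.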


8.4 Bianchi Identities 

There are some important mathematical properties of the Riemann
curvature. These identities will be used to constrain the
possible geometries that can occur.

A system with a symmetric connection, $\Gamma^i_{jk} = \Gamma^i_{kj}$, is
torsion free.⁹

(define nabla
  (covariant-derivative
   (Christoffel->Cartan
    (symmetrize-Christoffel
     (literal-Christoffel-2 'C R4-rect)))))

(((torsion nabla) omega X Y)
 (typical-point R4-rect))

0


The Bianchi identities are defined in terms of a cyclic-summation 
operator, which is most easily described as a Scheme procedure: 

(define ((cyclic-sum f) x y z)
  (+ (f x y z)
     (f y z x)
     (f z x y)))
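
The cyclic-summation operator is an ordinary higher-order function, so it is easy to mimic outside Scheme. The Python sketch below is an illustration only: `riemann_constant_curvature` is a hypothetical toy tensor with the algebraic form of a constant-curvature Riemann tensor in an orthonormal frame (sign and argument-order conventions vary between texts), and vectors and one-forms are plain component tuples. Its cyclic sum over the last three arguments vanishes, which is the algebraic content of the torsion-free first Bianchi identity.

```python
def cyclic_sum(f):
    """Return a function that sums f over cyclic permutations of its arguments."""
    def summed(x, y, z):
        return f(x, y, z) + f(y, z, x) + f(z, x, y)
    return summed

def dot(u, v):
    """Euclidean pairing of two component tuples."""
    return sum(a * b for a, b in zip(u, v))

def riemann_constant_curvature(K):
    """Toy tensor: R(omega, x, y, z) = K (g(z, x) omega(y) - g(y, x) omega(z))."""
    def R(omega, x, y, z):
        return K * (dot(z, x) * dot(omega, y) - dot(y, x) * dot(omega, z))
    return R

R = riemann_constant_curvature(1)
omega = (2, -1, 1)
X, Y, Z = (1, 2, 3), (-1, 0, 2), (4, 1, -2)

first_bianchi = cyclic_sum(lambda x, y, z: R(omega, x, y, z))(X, Y, Z)
```

With these integer components the three terms are -25, 49, and -24, so `first_bianchi` is exactly zero.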


9 Setup for this section:

(define omega (literal-1form-field 'omega-rect R4-rect))
(define X (literal-vector-field 'X-rect R4-rect))
(define Y (literal-vector-field 'Y-rect R4-rect))
(define Z (literal-vector-field 'Z-rect R4-rect))
(define V (literal-vector-field 'V-rect R4-rect))
The first Bianchi identity is

$$\mathsf{R}(\omega, x, y, z) + \mathsf{R}(\omega, y, z, x) + \mathsf{R}(\omega, z, x, y) = 0, \tag{8.32}$$

or, as a program:

(((cyclic-sum
   (lambda (x y z)
     ((Riemann nabla) omega x y z)))
  X Y Z)
 (typical-point R4-rect))

0

The second Bianchi identity is

$$\nabla_x \mathsf{R}(\omega, v, y, z) + \nabla_y \mathsf{R}(\omega, v, z, x) + \nabla_z \mathsf{R}(\omega, v, x, y) = 0 \tag{8.33}$$

or, as a program:

(((cyclic-sum
   (lambda (x y z)
     (((nabla x) (Riemann nabla))
      omega V y z)))
  X Y Z)
 (typical-point R4-rect))

0

Things get more complicated when there is torsion. We can
make a general connection, which has torsion:

(define nabla
  (covariant-derivative
   (Christoffel->Cartan
    (literal-Christoffel-2 'C R4-rect))))

(define R (Riemann nabla))

(define T (torsion-vector nabla))

(define (TT omega x y)
  (omega (T x y)))
The first Bianchi identity is now:¹⁰

(((cyclic-sum
   (lambda (x y z)
     (- (R omega x y z)
        (+ (omega (T (T x y) z))
           (((nabla x) TT) omega y z)))))
  X Y Z)
 (typical-point R4-rect))

0

and the second Bianchi identity for a general connection is

(((cyclic-sum
   (lambda (x y z)
     (+ (((nabla x) R) omega V y z)
        (R omega V (T x y) z))))
  X Y Z)
 (typical-point R4-rect))

0


10 The Bianchi identities are much nastier to write in traditional mathematical 
notation than as Scheme programs. 
Metrics


We often want to impose further structure on a manifold to allow 
us to define lengths and angles. This is done by generalizing the 
idea of the Euclidean dot product, which allows us to compute 
lengths of vectors and angles between vectors in traditional vector 
algebra. 

For vectors $u = u^x \hat{x} + u^y \hat{y} + u^z \hat{z}$ and
$v = v^x \hat{x} + v^y \hat{y} + v^z \hat{z}$ the dot product is
$u \cdot v = u^x v^x + u^y v^y + u^z v^z$. The generalization is
to provide coefficients for these terms and to include cross terms, 
consistent with the requirement that the function of two vectors is 
symmetric. This symmetric, bilinear, real-valued function of two 
vector fields is called a metric field. 

For example, the natural metric on a sphere of radius R is 

$$g(u, v) = R^2\left(d\theta(u)\, d\theta(v) + (\sin\theta)^2\, d\phi(u)\, d\phi(v)\right), \tag{9.1}$$

and the Minkowski metric on the 4-dimensional space of special 
relativity is 

$$g(u, v) = dx(u)\,dx(v) + dy(u)\,dy(v) + dz(u)\,dz(v) - c^2\, dt(u)\,dt(v). \tag{9.2}$$

Although these examples are expressed in terms of a coordinate 
basis, the value of the metric on vector fields does not depend on 
the coordinate system that is used to specify the metric. 

Given a metric field g and a vector field v the scalar field g(v, v) 
is the squared length of the vector at each point of the manifold. 
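
As a concrete reading of equations (9.1) and (9.2), the following Python sketch evaluates both metrics on component tuples. It is an illustration, not the book's representation: vectors here are bare tuples at a single point, the component orderings (t, x, y, z) and (theta, phi) are assumptions, and c defaults to 1.

```python
import math

def minkowski(u, v, c=1.0):
    """Equation (9.2) on component tuples ordered (t, x, y, z)."""
    return u[1] * v[1] + u[2] * v[2] + u[3] * v[3] - c * c * u[0] * v[0]

def sphere_metric(R, theta):
    """Equation (9.1) at colatitude theta, on tuples ordered (theta, phi)."""
    def g(u, v):
        return R * R * (u[0] * v[0] + math.sin(theta) ** 2 * u[1] * v[1])
    return g

# Squared lengths g(v, v) for three kinds of Minkowski vectors.
timelike = minkowski((1, 0, 0, 0), (1, 0, 0, 0))
spacelike = minkowski((0, 1, 0, 0), (0, 1, 0, 0))
null = minkowski((1, 1, 0, 0), (1, 1, 0, 0))

# On the equator of a sphere of radius 2, d/dphi has squared length R^2.
equator = sphere_metric(2.0, math.pi / 2)
phi_length_squared = equator((0, 1), (0, 1))
```

The Minkowski example shows that a metric in this general sense need not be positive: the squared length is negative, positive, or zero depending on the vector.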

Metric Music 

The metric can be used to construct a one-form field $\omega_u$ from a
vector field $u$, such that for any vector field $v$ we have

$$\omega_u(v) = g(v, u). \tag{9.3}$$

The operation of constructing a one-form field from a vector field
using a metric is called “lowering” the vector field. It is sometimes
notated as

$$\omega_u = g^\flat(u). \tag{9.4}$$
There is also an inverse metric that takes two one-form fields.
It is defined by the relation

$$\delta^i_k = \sum_j g^{-1}(\tilde{e}^i, \tilde{e}^j)\, g(e_j, e_k), \tag{9.5}$$

where $e$ and $\tilde{e}$ are any basis and its dual basis.

The inverse metric can be used to construct a vector field $v_\omega$
from a one-form field $\omega$, such that for any one-form field $\tau$ we
have

$$\tau(v_\omega) = g^{-1}(\omega, \tau). \tag{9.6}$$

This definition is implicit, but the vector field can be explicitly
computed from the one-form field with respect to a basis as follows:

$$v_\omega = \sum_i g^{-1}(\omega, \tilde{e}^i)\, e_i. \tag{9.7}$$

The operation of constructing a vector field from a one-form field
using a metric is called “raising” the one-form field. It is sometimes
notated

$$v_\omega = g^\sharp(\omega). \tag{9.8}$$

The raising and lowering operations allow one to interchange
the vector fields and the one-form fields. However they should not
be confused with the dual operation that allows one to construct a
dual one-form basis from a vector basis or construct a vector basis
from a one-form basis. The dual operation that interchanges bases
is defined without assigning a metric structure on the space.

Lowering a vector field with respect to a metric is a simple
program:

(define ((lower metric) u)
  (define (omega v) (metric v u))
  (procedure->1form-field omega))

But raising a one-form field to make a vector field is a bit more
complicated:

(define (raise metric basis)
  (let ((gi (metric:invert metric basis)))
    (lambda (omega)
      (contract (lambda (e_i w^i)
                  (* (gi omega w^i) e_i))
                basis))))

where contract is the trace over a basis of a two-argument function
that takes a vector field and a one-form field as its arguments.¹

(define (contract proc basis)
  (let ((vector-basis (basis->vector-basis basis))
        (1form-basis (basis->1form-basis basis)))
    (s:sigma/r proc
               vector-basis
               1form-basis)))
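
In components at a single point, lowering is multiplication by the matrix of metric coefficients and raising is multiplication by its inverse. The Python sketch below illustrates this for the sphere metric of equation (9.1); it is a component-level analogy with hypothetical helper names, not a translation of the field-valued `lower` and `raise` procedures.

```python
import math

def lower(g, u):
    """Components of omega_u, where omega_u(v) = g(v, u)."""
    n = len(u)
    return [sum(g[i][j] * u[j] for j in range(n)) for i in range(n)]

def invert_2x2(g):
    """Inverse of a 2x2 matrix, used here as the inverse metric."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def raise_index(g, omega):
    """Components of v_omega = sum_i g^{-1}(omega, e~^i) e_i, as in (9.7)."""
    gi = invert_2x2(g)
    return [sum(gi[i][j] * omega[j] for j in range(2)) for i in range(2)]

R, theta = 2.0, 0.7
g = [[R * R, 0.0], [0.0, (R * math.sin(theta)) ** 2]]

u = [0.3, -1.1]
omega = lower(g, u)             # one-form components
u_again = raise_index(g, omega)  # back to vector components
```

Raising the lowered vector recovers the original components, which is the content of the defining relation (9.5).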


9.1 Metric Compatibility 

A connection is said to be compatible with a metric $g$ if the
covariant derivative for that connection obeys the “product rule”:

$$\nabla_X(g(Y, Z)) = g(\nabla_X(Y), Z) + g(Y, \nabla_X(Z)). \tag{9.9}$$

For a metric there is a unique torsion-free connection that is
compatible with it. The Christoffel coefficients of the first kind are
computed from the metric by the following:

$$\bar{\Gamma}_{ijk} = \frac{1}{2}\left(e_k(g(e_i, e_j)) + e_j(g(e_i, e_k)) - e_i(g(e_j, e_k))\right), \tag{9.10}$$

for the coordinate basis $e$. We can then construct the Christoffel
coefficients of the second kind (the ones used previously to define
a connection) by “raising the first index.” To do this we define a
function of three vectors, with a weird currying:

$$\tilde{\Gamma}(v, w)(u) = \sum_{ijk} \bar{\Gamma}_{ijk}\, \tilde{e}^i(u)\, \tilde{e}^j(v)\, \tilde{e}^k(w). \tag{9.11}$$

1 Notice that raise and lower are not symmetrical. This is because vector
fields and form fields are not symmetrical: a vector field takes a manifold
function as its argument, whereas a form field takes a vector field as its
argument. This asymmetry is not apparent in traditional treatments based on
index notation.
This function takes two vector fields and produces a one-form field.
We can use it with equation (9.7) to construct a new function that
takes two vector fields and produces a vector field:

$$\hat{\Gamma}(v, w) = \sum_i g^{-1}(\tilde{\Gamma}(v, w), \tilde{e}^i)\, e_i. \tag{9.12}$$

We can now construct the Christoffel coefficients of the second
kind:

$$\Gamma^i_{jk} = \tilde{e}^i(\hat{\Gamma}(e_j, e_k)) = \sum_m \bar{\Gamma}_{mjk}\, g^{-1}(\tilde{e}^m, \tilde{e}^i). \tag{9.13}$$

The Cartan forms are then just

$$\varpi^i_j = \sum_k \Gamma^i_{jk}\, \tilde{e}^k = \sum_k \tilde{e}^i(\hat{\Gamma}(e_j, e_k))\, \tilde{e}^k. \tag{9.14}$$

So, for example, we can compute the Christoffel coefficients for 
the sphere from the metric for the sphere. First, we need the 
metric: 

(define ((g-sphere R) u v)
  (* (square R)
     (+ (* (dtheta u) (dtheta v))
        (* (compose (square sin) theta)
           (dphi u)
           (dphi v)))))

The Christoffel coefficients of the first kind are a complex structure 
with all three indices down: 

((Christoffel->symbols
  (metric->Christoffel-1 (g-sphere 'R) S2-basis))
 ((point S2-spherical) (up 'theta0 'phi0)))

(down
 (down (down 0 0)
       (down 0 (* (* (cos theta0) (sin theta0)) (expt R 2))))
 (down (down 0 (* (* (cos theta0) (sin theta0)) (expt R 2)))
       (down (* (* -1 (cos theta0) (sin theta0)) (expt R 2))
             0)))

And the Christoffel coefficients of the second kind have the inner- 
most index up: 
((Christoffel->symbols
  (metric->Christoffel-2 (g-sphere 'R) S2-basis))
 ((point S2-spherical) (up 'theta0 'phi0)))

(down (down (up 0 0)
            (up 0 (/ (cos theta0) (sin theta0))))
      (down (up 0 (/ (cos theta0) (sin theta0)))
            (up (* -1 (cos theta0) (sin theta0)) 0)))
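
Equations (9.10) and (9.13) can be followed step by step in components. The Python sketch below is an illustration under stated assumptions: it works at one point of the sphere, uses central differences instead of symbolic derivatives, indexes coordinates as 0 for theta and 1 for phi, and the function names are hypothetical rather than part of the Scheme system.

```python
import math

def metric(R, q):
    """Coefficients g_ij of the sphere metric at coordinates q = (theta, phi)."""
    theta, _phi = q
    return [[R * R, 0.0], [0.0, (R * math.sin(theta)) ** 2]]

def d_metric(R, q, k, h=1e-6):
    """Central-difference derivative of the metric coefficients along coordinate k."""
    qp, qm = list(q), list(q)
    qp[k] += h
    qm[k] -= h
    gp, gm = metric(R, qp), metric(R, qm)
    return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(2)] for i in range(2)]

def christoffel_1(R, q):
    """First kind, equation (9.10): c1[i][j][k] with all indices down."""
    dg = [d_metric(R, q, k) for k in range(2)]   # dg[k][i][j] = d_k g_ij
    return [[[0.5 * (dg[k][i][j] + dg[j][i][k] - dg[i][j][k])
              for k in range(2)] for j in range(2)] for i in range(2)]

def christoffel_2(R, q):
    """Second kind, equation (9.13): raise the first index with the inverse metric."""
    g = metric(R, q)
    g_inv = [[1.0 / g[0][0], 0.0], [0.0, 1.0 / g[1][1]]]   # metric is diagonal
    c1 = christoffel_1(R, q)
    return [[[sum(g_inv[i][m] * c1[m][j][k] for m in range(2))
              for k in range(2)] for j in range(2)] for i in range(2)]

R, q0 = 2.0, (0.7, 0.3)
c2 = christoffel_2(R, q0)
s, c = math.sin(q0[0]), math.cos(q0[0])
```

At the sample point the nonzero entries are `c2[0][1][1]`, close to minus sin(theta) cos(theta), and `c2[1][0][1]` and `c2[1][1][0]`, close to cos(theta)/sin(theta). The radius drops out of the second kind, as in the symbolic output above.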

Exercise 9.1: Metric Compatibility 

The connections constructed from a metric by equation (9.13) are “met- 
ric compatible,” as described in equation (9.9). Demonstrate that this 
is true for a literal metric, as described on page 6, in $R^4$. Your program
should produce a zero. 


9.2 Metrics and Lagrange Equations 

In the Introduction (Chapter 1) we showed that the Lagrange 
equations for a free particle constrained to a 2-dimensional surface
are equivalent to the geodesic equations for motion on that surface. 
We illustrated that in detail in Section 7.4 for motion on a sphere. 

Here we expand this understanding to show that the Christof- 
fel symbols can be derived from the Lagrange equations. Specifi- 
cally, if we solve the Lagrange equations for the acceleration (the 
highest-order derivatives) we find that the Christoffel symbols are 
the symmetrized coefficients of the quadratic velocity terms. 

Consider the Lagrange equations for a free particle, with
Lagrangian

$$L_2(t, x, v) = \frac{1}{2}\, g(x)(v, v). \tag{9.15}$$

If we solve the Lagrange equations for the accelerations, the
accelerations can be expressed with the geodesic equations (7.79):

$$D^2 q^i + \sum_{jk} (\Gamma^i_{jk} \circ \chi^{-1} \circ q)\, Dq^j\, Dq^k = 0. \tag{9.16}$$

We can verify this computationally. Given a metric, we can 
construct a Lagrangian where the kinetic energy is the metric 
applied to the velocity twice: The kinetic energy is proportional 
to the squared length of the velocity vector. 
(define (metric->Lagrangian metric coordsys)
  (define (L state)
    (let ((q (ref state 1)) (qd (ref state 2)))
      (define v
        (components->vector-field (lambda (m) qd) coordsys))
      ((* 1/2 (metric v v)) ((point coordsys) q))))
  L)

The following code compares the Christoffel symbols with the 
coefficients of the terms of second order in velocity appearing in 
the accelerations, determined by solving the Lagrange equations 
for the highest-order derivative.² We extract these terms by taking
two partials with respect to the structure of velocities. Because the 
elementary partials commute we get two copies of each coefficient, 
requiring a factor of 1/2. 

(let* ((metric (literal-metric 'g R3-rect))
       (q (typical-coords R3-rect))
       (L2 (metric->Lagrangian metric R3-rect)))
  (+ (* 1/2
        (((expt (partial 2) 2) (Lagrange-explicit L2))
         (up 't q (corresponding-velocities q))))
     ((Christoffel->symbols
       (metric->Christoffel-2 metric
                              (coordinate-system->basis R3-rect)))
      ((point R3-rect) q))))

(down (down (up 0 0 0) (up 0 0 0) (up 0 0 0))
      (down (up 0 0 0) (up 0 0 0) (up 0 0 0))
      (down (up 0 0 0) (up 0 0 0) (up 0 0 0)))

We get a structure of zeros, demonstrating the correspondence
between Christoffel symbols and coefficients of the Lagrange
equations.
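
The same comparison can be made numerically for a specific metric. The Python sketch below is an illustration with several assumptions: it uses the sphere metric with an arbitrary radius, solves the Lagrange equations of the kinetic-energy Lagrangian for the accelerations by hand (the metric is diagonal), and extracts the velocity-quadratic coefficients by polarization instead of taking two symbolic partials. The helper names are hypothetical.

```python
import math

R = 2.0  # sphere radius, an arbitrary choice for this sketch

def metric(q):
    theta, _phi = q
    return [[R * R, 0.0], [0.0, (R * math.sin(theta)) ** 2]]

def d_metric(q, k, h=1e-6):
    """Central difference of the metric coefficients along coordinate k."""
    qp, qm = list(q), list(q)
    qp[k] += h
    qm[k] -= h
    gp, gm = metric(qp), metric(qm)
    return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(2)] for i in range(2)]

def acceleration(q, v):
    """Accelerations from the Lagrange equations of L2 = 1/2 g(v, v)."""
    g = metric(q)
    dg = [d_metric(q, k) for k in range(2)]   # dg[k][i][j] = d_k g_ij
    force = [sum(0.5 * dg[i][j][k] * v[j] * v[k] - dg[k][i][j] * v[j] * v[k]
                 for j in range(2) for k in range(2))
             for i in range(2)]
    return [force[i] / g[i][i] for i in range(2)]   # diagonal metric

def half_velocity_hessian(q):
    """C[i][j][k] = 1/2 d^2 a^i / dv^j dv^k; exact because a is quadratic in v."""
    e = [[1.0, 0.0], [0.0, 1.0]]
    C = [[[0.0] * 2 for _ in range(2)] for _ in range(2)]
    for j in range(2):
        for k in range(2):
            both = acceleration(q, [e[j][m] + e[k][m] for m in range(2)])
            aj, ak = acceleration(q, e[j]), acceleration(q, e[k])
            for i in range(2):
                C[i][j][k] = 0.5 * (both[i] - aj[i] - ak[i])
    return C

def christoffel_2(q):
    """Known second-kind coefficients of the sphere, indices (theta, phi) = (0, 1)."""
    s, c = math.sin(q[0]), math.cos(q[0])
    G = [[[0.0] * 2 for _ in range(2)] for _ in range(2)]
    G[0][1][1] = -s * c
    G[1][0][1] = G[1][1][0] = c / s
    return G

q0 = (0.7, 0.3)
C, G = half_velocity_hessian(q0), christoffel_2(q0)
residual = max(abs(C[i][j][k] + G[i][j][k])
               for i in range(2) for j in range(2) for k in range(2))
```

The sum of the two structures vanishes up to finite-difference error, mirroring the structure of zeros above.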

Thus, if we have a metric specifying an inner product, the 
geodesic equations are equivalent to the Lagrange equations for 


2 The procedure Lagrange-explicit produces the accelerations of the coordi- 
nates. In this code the division operator (/) multiplies its first argument on 
the left by the inverse of its second argument. 

(define (Lagrange-explicit L)
  (let ((P ((partial 2) L))
        (F ((partial 1) L)))
    (/ (- F (+ ((partial 0) P) (* ((partial 1) P) velocity)))
       ((partial 2) P))))
the Lagrangian that is equal to the inner product of the general-
ized velocities with themselves. 

Kinetic Energy or Arc Length 

A geodesic is a path of stationary length with respect to variations 
in the path that keep the endpoints fixed. On the other hand, the 
solutions of the Lagrange equations are paths of stationary action 
that keep the endpoints fixed. How are these solutions related? 

The integrand of the traditional action is the Lagrangian, which
is in this case the Lagrangian $L_2$, the kinetic energy. The integrand
of the arc length is

$$L_1(t, x, v) = \sqrt{g(x)(v, v)} = \sqrt{2 L_2(t, x, v)} \tag{9.17}$$

and the path length is

$$\tau = \int_{t_1}^{t_2} L_1(t, q(t), Dq(t))\, dt. \tag{9.18}$$

If we compute the Lagrange equations for $L_2$ we get the Lagrange
equations for $L_1$ with a correction term. Since

$$L_2(t, x, v) = \frac{1}{2}\left(L_1(t, x, v)\right)^2, \tag{9.19}$$

and the Lagrange operator for $L_2$ is³

$$E[L_2] = D_t \partial_2 L_2 - \partial_1 L_2,$$

we find

$$E[L_2] = L_1\, E[L_1] + \partial_2 L_1\, D_t L_1. \tag{9.20}$$
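
The step to equation (9.20) is a product-rule computation. A sketch of the intermediate lines, using only the relation between the two Lagrangians and the definition of the Lagrange operator given above:

```latex
\begin{aligned}
\partial_2 L_2 &= L_1\,\partial_2 L_1, \qquad \partial_1 L_2 = L_1\,\partial_1 L_1,\\
E[L_2] &= D_t\!\left(L_1\,\partial_2 L_1\right) - L_1\,\partial_1 L_1\\
       &= L_1\left(D_t\,\partial_2 L_1 - \partial_1 L_1\right) + \partial_2 L_1\,D_t L_1\\
       &= L_1\,E[L_1] + \partial_2 L_1\,D_t L_1 .
\end{aligned}
```

The correction term is proportional to the total time derivative of the arc-length integrand, so it vanishes on any path along which that integrand is constant.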

$L_2$ is the kinetic energy. It is conserved along solution paths,
since there is no explicit time dependence. Because of the relation
between $L_1$ and $L_2$, $L_1$ is also a conserved quantity. Let $L_1$ take
the constant value $a$ on the geodesic coordinate path $q$ we are

3 E is the Euler-Lagrange operator, which gives the residuals of the Lagrange
equations for a Lagrangian. $\Gamma$ extends a configuration-space path $q$ to
make a state-space path, with as many terms as needed:
$\Gamma[q](t) = (t, q(t), Dq(t), \ldots)$. The total time derivative $D_t$ is
defined by $D_t F \circ \Gamma[q] = D(F \circ \Gamma[q])$ for any state function
$F$ and path $q$. The Lagrange equations are $E[L] \circ \Gamma[q] = 0$.
See [19] for more details.
considering. Then $\tau = a(t_2 - t_1)$. Since $L_1$ is conserved,
$(D_t L_1) \circ \Gamma[q] = 0$ on the geodesic path $q$, and both
$E[L_1] \circ \Gamma[q] = 0$ and $E[L_2] \circ \Gamma[q] = 0$, as required by
equation (9.20).

Since $L_2$ is homogeneous of degree 2 in the velocities, $L_1$ is
homogeneous of degree 1. So we cannot solve for the highest-order
derivative in the Lagrange-Euler equations derived from $L_1$: the
Lagrange equations of the Lagrangian $L_1$ are dependent. But although
they do not uniquely specify the evolution, they do specify
the geodesic path.

On the other hand, we can solve for the highest-order derivative
in $E[L_2]$. This is because $L_1 E[L_1]$ is homogeneous of degree 2.
So the equations derived from $L_2$ uniquely determine the time
evolution along the geodesic path.

For Two Dimensions 

We can show this is true for a 2-dimensional system with a general 
metric. We define the Lagrangians in terms of this metric: 

(define L2
  (metric->Lagrangian (literal-metric 'm R2-rect)
                      R2-rect))

(define (L1 state)
  (sqrt (* 2 (L2 state))))

Although the mass matrix of $L_2$ is nonsingular

(determinant
 (((partial 2) ((partial 2) L2))
  (up 't (up 'x 'y) (up 'vx 'vy))))

(+ (* (m_00 (up x y)) (m_11 (up x y)))
   (* -1 (expt (m_01 (up x y)) 2)))

the mass matrix of $L_1$ has determinant zero

(determinant
 (((partial 2) ((partial 2) L1))
  (up 't (up 'x 'y) (up 'vx 'vy))))

0

showing that these Lagrange equations are dependent. 
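
The vanishing determinant follows from homogeneity: the velocity itself is a null vector of the mass matrix of the arc-length Lagrangian. The Python sketch below checks this for one constant metric. The metric coefficients and velocity are arbitrary sample values, and the second derivatives were worked out by hand for this sketch rather than taken from the book.

```python
import math

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def mass_matrix_L2(g, v):
    """Second velocity derivatives of L2 = 1/2 g(v, v): the metric itself."""
    return [row[:] for row in g]

def mass_matrix_L1(g, v):
    """Second velocity derivatives of L1 = sqrt(g(v, v)), differentiated by hand."""
    gv = [sum(g[i][j] * v[j] for j in range(2)) for i in range(2)]
    L1 = math.sqrt(sum(v[i] * gv[i] for i in range(2)))
    return [[g[i][j] / L1 - gv[i] * gv[j] / L1 ** 3 for j in range(2)]
            for i in range(2)]

g = [[2.0, 0.5], [0.5, 1.0]]   # a sample positive-definite metric
v = [0.3, -1.2]                # a sample velocity

det_L2 = det2(mass_matrix_L2(g, v))
det_L1 = det2(mass_matrix_L1(g, v))

# The velocity is annihilated by the L1 mass matrix.
H1 = mass_matrix_L1(g, v)
H1_times_v = [sum(H1[i][j] * v[j] for j in range(2)) for i in range(2)]
```

For this sample `det_L2` is the determinant of the metric, while `det_L1` and both entries of `H1_times_v` are zero up to rounding.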

We can show this dependence explicitly, for a simple system. 
Consider the simplest possible system, a geodesic (straight line) 
in a plane: 
(define (L1 state)
  (sqrt (square (velocity state))))

(((Lagrange-equations L1)
  (up (literal-function 'x) (literal-function 'y)))
 't)

(down
 (/ (+ (* (((expt D 2) x) t) (expt ((D y) t) 2))
       (* -1 ((D x) t) ((D y) t) (((expt D 2) y) t)))
    (expt (+ (expt ((D x) t) 2) (expt ((D y) t) 2)) 3/2))
 (/ (+ (* -1 (((expt D 2) x) t) ((D x) t) ((D y) t))
       (* (expt ((D x) t) 2) (((expt D 2) y) t)))
    (expt (+ (expt ((D x) t) 2) (expt ((D y) t) 2)) 3/2)))

These residuals must be zero; so the numerators must be zero.⁴
They are:

$$D^2x\,(Dy)^2 = Dx\,Dy\,D^2y$$
$$D^2x\,Dx\,Dy = (Dx)^2\,D^2y$$

Note that the only constraint is $D^2x\,Dy = Dx\,D^2y$, so the
resulting Lagrange equations are dependent.

This is enough to determine that the result is a straight line,
without specifying the rate along the line. Suppose $y = f(x)$, for
path $(x(t), y(t))$. Then, by the chain rule,

$$Dy = Df(x)\,Dx \quad\text{and}\quad D^2y = D^2f(x)\,(Dx)^2 + Df(x)\,D^2x.$$

Substituting, we get

$$Df(x)\,Dx\,D^2x = Dx\left(D^2f(x)\,(Dx)^2 + Df(x)\,D^2x\right)$$

or

$$Df(x)\,D^2x = D^2f(x)\,(Dx)^2 + Df(x)\,D^2x,$$

so $D^2f(x) = 0$ wherever $Dx \neq 0$. Thus $f$ is a straight line, as
required.
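
The single independent constraint can be tested on sample paths. In the Python sketch below the paths, the evaluation time, and the finite-difference step are assumptions made for illustration. A straight line traversed at a nonuniform rate satisfies the constraint, while a parabola does not.

```python
def d1(f, t, h=1e-4):
    """Central-difference first derivative."""
    return (f(t + h) - f(t - h)) / (2 * h)

def d2(f, t, h=1e-4):
    """Central-difference second derivative."""
    return (f(t + h) - 2 * f(t) + f(t - h)) / (h * h)

def constraint(x, y, t):
    """D^2x Dy - Dx D^2y, which must vanish along a solution path."""
    return d2(x, t) * d1(y, t) - d1(x, t) * d2(y, t)

x = lambda t: t ** 3 + t         # nonuniform rate along the path
line = lambda t: 2 * x(t) + 1    # y = 2x + 1
parabola = lambda t: x(t) ** 2   # y = x^2

on_line = constraint(x, line, 0.5)
on_parabola = constraint(x, parabola, 0.5)
```

The value on the line is zero up to finite-difference error even though the motion is not uniform, which is the sense in which these equations fix the path but not the rate.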

Reparameterization 

More generally, a differential equation system $F[q](t) = 0$ is said
to be reparameterized if the coordinate path $q$ is replaced with a


4 We cheated: We hand-simplified the denominator to make the result more 
obvious. 
new coordinate path $q \circ f$. For example, we may change the scale
of the independent variable. The system $F[q \circ f] = 0$ is said to be
independent of the parameterization if and only if $F[q] \circ f = 0$.
So the differential equation system is satisfied by $q \circ f$ if and only
if it is satisfied by $q$.

The Lagrangian $L_1$ is homogeneous of degree 1 in the velocities;
so

$$E[L_1] \circ \Gamma[q \circ f] - (E[L_1] \circ \Gamma[q] \circ f)\,Df = 0. \tag{9.21}$$

We can check this in a simple case. For two dimensions $q = (x, y)$,
the condition under which a reparameterization $f$ of the geodesic
paths with coordinates $q$ satisfies the Lagrange equations for $L_1$
is:

(let ((x (literal-function 'x))
      (y (literal-function 'y))
      (f (literal-function 'f))
      (E1 (Euler-Lagrange-operator L1)))
  ((- (compose E1
               (Gamma (up (compose x f)
                          (compose y f))
                      4))
      (* (compose E1
                  (Gamma (up x y) 4)
                  f)
         (D f)))
   't))

(down 0 0)


This residual is identically satisfied, showing that the Lagrange
equations for $L_1$ are independent of the parameterization of the
independent variable.

The Lagrangian $L_2$ is homogeneous of degree 2 in the velocities;
so

$$E[L_2] \circ \Gamma[q \circ f] - (E[L_2] \circ \Gamma[q] \circ f)(Df)^2 = (\partial_2 L_2 \circ \Gamma[q] \circ f)(D^2 f). \tag{9.22}$$

Although the Euler-Lagrange equations for $L_1$ are invariant under
an arbitrary reparameterization ($Df \neq 0$), the Euler-Lagrange
equations for $L_2$ are invariant only for a restricted set of $f$. The
conditions under which a reparameterization $f$ of geodesic paths
with coordinates $q$ satisfies the Lagrange equations for $L_2$ are:
(let ((q (up (literal-function 'x) (literal-function 'y)))
      (f (literal-function 'f)))
  ((- (compose (Euler-Lagrange-operator L2)
               (Gamma (compose q f) 4))
      (* (compose (Euler-Lagrange-operator L2)
                  (Gamma q 4)
                  f)
         (expt (D f) 2)))
   't))

(down
 (* (+ (* ((D x) (f t)) (m_00 (up (x (f t)) (y (f t)))))
       (* ((D y) (f t)) (m_01 (up (x (f t)) (y (f t))))))
    (((expt D 2) f) t))
 (* (+ (* ((D x) (f t)) (m_01 (up (x (f t)) (y (f t)))))
       (* ((D y) (f t)) (m_11 (up (x (f t)) (y (f t))))))
    (((expt D 2) f) t)))

We see that if these expressions must be zero, then $D^2 f = 0$. This
tells us that $f$ is at most affine in $t$: $f(t) = at + b$.
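
The simplest instance is the flat plane, where the kinetic-energy Lagrangian is half the squared speed and its Lagrange-equation residual is just the acceleration. The Python sketch below uses that special case; the sample path and the two reparameterizations are assumptions chosen for illustration. An affine change of parameter preserves the solution, and a quadratic one does not.

```python
def d2(f, t, h=1e-4):
    """Central-difference second derivative."""
    return (f(t + h) - 2 * f(t) + f(t - h)) / (h * h)

def residual_L2(path, t):
    """Residual of the Lagrange equations for L2 = 1/2 (vx^2 + vy^2): the acceleration."""
    return [d2(lambda s, i=i: path(s)[i], t) for i in range(2)]

q = lambda t: (t, 2.0 * t)        # uniform motion on a straight line
affine = lambda t: 3.0 * t + 1.0  # D^2 f = 0
quadratic = lambda t: t * t       # D^2 f = 2

r_affine = residual_L2(lambda t: q(affine(t)), 0.5)
r_quadratic = residual_L2(lambda t: q(quadratic(t)), 0.5)
```

For the quadratic reparameterization the residual is close to (2, 4), the velocity of q times the second derivative of f, as the right-hand side of equation (9.22) predicts for this Lagrangian.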

Exercise 9.2: SO(3) Geodesics

We have derived a basis for SO(3) in terms of incremental rotations
around the rectangular axes. See equations (4.29, 4.30, 4.31). We can
use the dual basis to define a metric on SO(3).

(define (SO3-metric v1 v2)
  (+ (* (e^x v1) (e^x v2))
     (* (e^y v1) (e^y v2))
     (* (e^z v1) (e^z v2))))

This metric determines a connection. Show that uniform rotation about
an arbitrary axis traces a geodesic on SO(3).

Exercise 9.3: Curvature of a Spherical Surface 

The 2-dimensional surface of a 3-dimensional sphere can be embedded 
in three dimensions with a metric that depends on the radius: 

(define M (make-manifold S^2-type 2 3))

(define spherical
  (coordinate-system-at 'spherical 'north-pole M))

(define-coordinates (up theta phi) spherical)

(define spherical-basis (coordinate-system->basis spherical))

(define ((spherical-metric r) v1 v2)
  (* (square r)
     (+ (* (dtheta v1) (dtheta v2))
        (* (square (sin theta))
           (dphi v1) (dphi v2)))))
If we raise one index of the Ricci tensor (see equation 8.20) by
contracting it with the inverse of the metric tensor we can further
contract it to obtain a scalar manifold function:

$$R = \sum_{ij} g^{-1}(\tilde{e}^i, \tilde{e}^j)\, R(e_i, e_j). \tag{9.23}$$

The trace2down procedure converts a tensor that takes two vector fields
into a tensor that takes a vector field and a one-form field, and then it
contracts the result over a basis to make a trace. It is useful for getting
the Ricci scalar from the Ricci tensor, given a metric and a basis.

(define ((trace2down metric-tensor basis) tensor)
  (let ((inverse-metric-tensor
         (metric:invert metric-tensor basis)))
    (contract
     (lambda (v1 w1)
       (contract
        (lambda (v w)
          (* (inverse-metric-tensor w1 w)
             (tensor v v1)))
        basis))
     basis)))


Evaluate the Ricci scalar for a sphere of radius r to obtain a measure of 
its intrinsic curvature. You should obtain the answer $2/r^2$.

Exercise 9.4: Curvature of a Pseudosphere 

Compute the scalar curvature of the pseudosphere (see exercise 8.2). 
You should obtain the value $-2$.


9.3 General Relativity 

By analogy to Newtonian mechanics, relativistic mechanics has 
two parts. There are equations of motion that describe how parti- 
cles move under the influence of “forces” and there are field equa- 
tions that describe how the forces arise. In general relativity the 
only force considered is gravity. However, gravity is not treated as 
a force. Instead, gravity arises from curvature in the spacetime, 
and the equations of motion are motion along geodesics of that 
space. 

The geodesic equations for a spacetime with the metric

$$\begin{aligned}
g(v_1, v_2) = {} & -c^2\left(1 + \frac{2V}{c^2}\right) dt(v_1)\,dt(v_2) \\
& + dx(v_1)\,dx(v_2) \\
& + dy(v_1)\,dy(v_2) \\
& + dz(v_1)\,dz(v_2)
\end{aligned} \tag{9.24}$$

are Newton’s equations to lowest order in $V/c^2$:

$$D^2 x(t) = -\operatorname{grad} V(x(t)). \tag{9.25}$$

Exercise 9.5: Newton’s Equations 

Verify that Newton’s equations (9.25) are indeed the lowest-order terms 
of the geodesic equations for the metric (9.24). 


Einstein’s field equations tell how the local energy-momentum 
distribution determines the local shape of the spacetime, as de- 
scribed by the metric tensor g. The equations are traditionally 
written 

$$R_{\mu\nu} - \frac{1}{2} R\, g_{\mu\nu} + \Lambda g_{\mu\nu} = \frac{8\pi G}{c^4}\, T_{\mu\nu} \tag{9.26}$$

where $R_{\mu\nu}$ are the components of the Ricci tensor (equation 8.20),
$R$ is the Ricci scalar (equation 9.23),⁵ and $\Lambda$ is the cosmological
constant.

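$T_{\mu\nu}$ are the components of the stress-energy tensor describing
the energy-momentum distribution. Equivalently, one can write

$$R_{\mu\nu} = \frac{8\pi G}{c^4}\left(T_{\mu\nu} - \frac{1}{2}\, T g_{\mu\nu}\right) + \Lambda g_{\mu\nu} \tag{9.27}$$

where $T = T_{\mu\nu}\, g^{\mu\nu}$.⁶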


5 The tensor with components $G_{\mu\nu} = R_{\mu\nu} - \frac{1}{2} R g_{\mu\nu}$ is
called the Einstein tensor. In his search for an appropriate field equation
for gravity, Einstein demanded general covariance (independence of coordinate
system) and local Lorentz invariance (at each point transformations must
preserve the line element). These considerations led Einstein to look for a
tensor equation (see Appendix C).

6 Start with equation (9.26). Raise one index of both sides, and then contract.
Notice that the trace $g^\mu_\mu = 4$, the dimension of spacetime. This gets
$R = 4\Lambda - (8\pi G/c^4)T$, from which we can deduce equation (9.27).
Einstein’s field equations arise from a heuristic derivation by
analogy to the Poisson equation for a Newtonian gravitational
field:

$$\operatorname{Lap}(V) = 4\pi G \rho \tag{9.28}$$

where $V$ is the gravitational potential field at a point, $\rho$ is the
mass density at that point, and Lap is the Laplacian operator.

The time-time component of the Ricci tensor derived from the
metric (9.24) is the Laplacian of the potential, to lowest order.

(define (Newton-metric M G c V)
  (let ((a
         (+ 1 (* (/ 2 (square c))
                 (compose V (up x y z))))))
    (define (g v1 v2)
      (+ (* -1 (square c) a (dt v1) (dt v2))
         (* (dx v1) (dx v2))
         (* (dy v1) (dy v2))
         (* (dz v1) (dz v2))))
    g))

(define (Newton-connection M G c V)
  (Christoffel->Cartan
   (metric->Christoffel-2 (Newton-metric M G c V)
                          spacetime-rect-basis)))

(define nabla
  (covariant-derivative
   (Newton-connection 'M 'G ':c
     (literal-function 'V (-> (UP Real Real Real) Real)))))

(((Ricci nabla (coordinate-system->basis spacetime-rect))
  d/dt d/dt)
 ((point spacetime-rect) (up 't 'x 'y 'z)))

mess


The leading terms of the mess are

(+ (((partial 0) ((partial 0) V)) (up x y z))
   (((partial 1) ((partial 1) V)) (up x y z))
   (((partial 2) ((partial 2) V)) (up x y z)))

which is the Laplacian of $V$. The other terms are smaller by $V/c^2$.

Now consider the right-hand side of equation (9.27). In the
Poisson equation the source of the gravitational potential is the
density of matter. Let the time-time component of the stress-energy
tensor $T^{00}$ be the matter density $\rho$. Here is a program for
the stress-energy tensor:

(define (Tdust rho)
  (define (T w1 w2)
    (* rho (w1 d/dt) (w2 d/dt)))
  T)

If we evaluate the right-hand side expression we obtain: 7

(let ((g (Newton-metric 'M 'G ':c V)))
  (let ((T_ij ((drop2 g spacetime-rect-basis) (Tdust 'rho))))
    (let ((T ((trace2down g spacetime-rect-basis) T_ij)))
      ((- (T_ij d/dt d/dt) (* 1/2 T (g d/dt d/dt)))
       ((point spacetime-rect) (up 't 'x 'y 'z))))))
(* 1/2 (expt :c 4) rho)


So, to make the Poisson analogy we get

$$R_{\mu\nu} = \frac{8\pi G}{c^4}\left(T_{\mu\nu} - \tfrac{1}{2} T\, g_{\mu\nu}\right) \tag{9.29}$$

as required.
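As a check on the constant, the two time-time components computed by the programs above can be set side by side. This is only a leading-order sketch: it assumes $|V|/c^2 \ll 1$, so that the metric factor $1 + 2V/c^2$ is treated as 1, and it takes $\Lambda = 0$.

```latex
\begin{align*}
R_{tt} &\approx \mathrm{Lap}(V), \\
T_{tt} - \tfrac12 T\, g_{tt} &\approx \tfrac12\, c^4 \rho, \\
\frac{8\pi G}{c^4}\left(T_{tt} - \tfrac12 T\, g_{tt}\right)
   &\approx \frac{8\pi G}{c^4}\cdot\tfrac12\, c^4 \rho = 4\pi G \rho .
\end{align*}
```

Equating the first and last lines reproduces the Poisson equation $\mathrm{Lap}(V) = 4\pi G\rho$, which is what fixes the coefficient $8\pi G/c^4$.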


Exercise 9.6: Curvature of Schwarzschild Spacetime 

In spherical coordinates around a nonrotating gravitating body the met- 
ric of Schwarzschild spacetime is given as: 8 


7 The procedure trace2down is defined on page 144. This expression also uses
drop2, which converts a tensor field that takes two one-form fields into a tensor
field that takes two vector fields. Its definition is

(define ((drop2 metric-tensor basis) tensor)
  (lambda (v1 v2)
    (contract
     (lambda (e1 w1)
       (contract
        (lambda (e2 w2)
          (* (metric-tensor v1 e1)
             (tensor w1 w2)
             (metric-tensor e2 v2)))
        basis))
     basis)))

8 The spacetime manifold is built from $R^4$ with the addition of appropriate
coordinate systems:

(define spacetime (make-manifold R^n 4))

(define spacetime-rect
  (coordinate-system-at 'rectangular 'origin spacetime))

(define spacetime-sphere
  (coordinate-system-at 'spacetime-spherical 'origin spacetime))
(define-coordinates (up t r theta phi) spacetime-sphere)

(define (Schwarzschild-metric M G c)
  (let ((a (- 1 (/ (* 2 G M) (* (square c) r)))))
    (lambda (v1 v2)
      (+ (* -1 (square c) a (dt v1) (dt v2))
         (* (/ 1 a) (dr v1) (dr v2))
         (* (square r)
            (+ (* (dtheta v1) (dtheta v2))
               (* (square (sin theta))
                  (dphi v1) (dphi v2))))))))

Show that the Ricci curvature of the Schwarzschild spacetime is zero. 
Use the definition of the Ricci tensor in equation (8.20). 

Exercise 9.7: Circular Orbits in Schwarzschild Spacetime 

Test particles move along geodesics in spacetime. Now that we have a 
metric for Schwarzschild spacetime (page 147) we can use it to construct 
the geodesic equations and determine how test particles move. Consider 
circular orbits. For example, the circular orbit along a line of constant 
longitude is a geodesic, so it should satisfy the geodesic equations. Here 
is the equation of a circular path along the zero longitude line. 

(define (prime-meridian r omega)
  (compose (point spacetime-sphere)
           (lambda (t) (up t r (* omega t) 0))
           (chart R1-rect)))

This equation will satisfy the geodesic equations for compatible values
of the radius r and the angular velocity omega. If you substitute this
into the geodesic equation and set the residual to zero you will obtain a
constraint relating r and omega. Do it.

Surprise: You should find out that $\omega^2 r^3 = GM$, which is Kepler’s law!

Exercise 9.8: Stability of Circular Orbits 

In Schwarzschild spacetime there are stable circular orbits if the coordi- 
nate r is large enough, but below that value all orbits are unstable. The 
critical value of r is larger than the Schwarzschild horizon radius. Let’s 
find that value. 

For example, we can consider a perturbation of the orbit of constant 
longitude. Here is the result of adding an exponential variation of size 
epsilon: 

(define (prime-meridian+X r epsilon X)
  (compose
   (point spacetime-sphere)
   (lambda (t)
     (up (+ t (* epsilon (* (ref X 0) (exp (* 'lambda t)))))
         (+ r (* epsilon (* (ref X 1) (exp (* 'lambda t)))))
         (+ (* (sqrt (/ (* 'G 'M) (expt r 3))) t)
            (* epsilon (* (ref X 2) (exp (* 'lambda t)))))
         0))
   (chart R1-rect)))
Plugging this into the geodesic equation yields a structure of residuals:

(define (geodesic-equation+X-residuals eps X)
  (let ((gamma (prime-meridian+X 'r eps X)))
    (((((covariant-derivative Cartan gamma) d/dtau)
       ((differential gamma) d/dtau))
      (chart spacetime-sphere))
     ((point R1-rect) 't))))

The characteristic equation in the eigenvalue lambda can be obtained as
the numerator of the expression:

(determinant
 (submatrix (((* (partial 1) (partial 0))
              geodesic-equation+X-residuals)
             0
             (up 0 0 0))
            0 3 0 3))

Show that the orbits are unstable if $r < 6GM/c^2$.

Exercise 9.9: Friedmann-Lemaître-Robertson-Walker

The Einstein tensor G (see footnote 5) can be expressed as a program:

(define (Einstein coordinate-system metric-tensor)
  (let* ((basis (coordinate-system->basis coordinate-system))
         (connection
          (Christoffel->Cartan
           (metric->Christoffel-2 metric-tensor basis)))
         (nabla (covariant-derivative connection))
         (Ricci-tensor (Ricci nabla basis))
         (Ricci-scalar
          ((trace2down metric-tensor basis) Ricci-tensor)))
    (define (Einstein-tensor v1 v2)
      (- (Ricci-tensor v1 v2)
         (* 1/2 Ricci-scalar (metric-tensor v1 v2))))
    Einstein-tensor))

(define (Einstein-field-equation
         coordinate-system metric-tensor Lambda stress-energy-tensor)
  (let ((Einstein-tensor
         (Einstein coordinate-system metric-tensor)))
    (define EFE-residuals
      (- (+ Einstein-tensor (* Lambda metric-tensor))
         (* (/ (* 8 :pi :G) (expt :c 4))
            stress-energy-tensor)))
    EFE-residuals))

One exact solution to the Einstein equations was found by Alexander
Friedmann in 1922. He showed that a metric for an isotropic and
homogeneous spacetime was consistent with a similarly isotropic and
homogeneous stress-energy tensor in Einstein’s equations. In this case
the residuals of the Einstein equations gave ordinary differential
equations for the time-dependent scale of the universe. These are called the
Robertson-Walker equations. Friedmann’s metric is:

(define (FLRW-metric c k R)
  (define-coordinates (up t r theta phi) spacetime-sphere)
  (let ((a (/ (square (compose R t)) (- 1 (* k (square r)))))
        (b (square (* (compose R t) r))))
    (define (g v1 v2)
      (+ (* -1 (square c) (dt v1) (dt v2))
         (* a (dr v1) (dr v2))
         (* b (+ (* (dtheta v1) (dtheta v2))
                 (* (square (sin theta))
                    (dphi v1) (dphi v2))))))
    g))

Here c is the speed of light, k is the intrinsic curvature, and R is a length 
scale that is a function of time. 

The associated stress-energy tensor is 

(define (Tperfect-fluid rho p c metric)
  (define-coordinates (up t r theta phi) spacetime-sphere)
  (let* ((basis (coordinate-system->basis spacetime-sphere))
         (inverse-metric (metric:invert metric basis)))
    (define (T w1 w2)
      (+ (* (+ (compose rho t)
               (/ (compose p t) (square c)))
            (w1 d/dt) (w2 d/dt))
         (* (compose p t) (inverse-metric w1 w2))))
    T))


where rho is the energy density, and p is the pressure in an ideal fluid 
model. 

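The Robertson-Walker equations are:

$$\begin{aligned}
\left(\frac{DR(t)}{R(t)}\right)^2 + \frac{k c^2}{(R(t))^2} - \frac{\Lambda c^2}{3} &= \frac{8\pi G}{3}\,\rho(t), \\
2\,\frac{D^2R(t)}{R(t)} - \frac{2}{3}\Lambda c^2 &= -8\pi G\left(\frac{\rho(t)}{3} + \frac{p(t)}{c^2}\right).
\end{aligned} \tag{9.30}$$

Use the programs supplied to derive the Robertson-Walker equations.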


Exercise 9.10: Cosmology 

For energy to be conserved, the stress-energy tensor must be constrained
so that its covariant divergence is zero

$$\sum_{\mu} \nabla_{e_\mu} T(\tilde{e}^{\mu}, \omega) = 0 \tag{9.31}$$

for every one-form $\omega$.

a. Show that for the perfect fluid stress-energy tensor and the FLRW
metric this constraint is equivalent to the differential equation

$$D(c^2 \rho R^3) + p\, D(R^3) = 0. \tag{9.32}$$

b. Assume that in a “matter-dominated universe” radiation pressure is
negligible, so $p = 0$. Using the Robertson-Walker equations (9.30) and
the energy conservation equation (9.32) show that the observation of an
expanding universe is compatible with a negative curvature universe, a
flat universe, or a positive curvature universe: $k \in \{-1, 0, +1\}$.
Hodge Star and Electrodynamics


The vector space of p-form fields on an n-dimensional manifold has
dimension $n!/((n-p)!\,p!)$. This is the same dimension as the space
of $(n-p)$-form fields. So these vector spaces are isomorphic. If
we have a metric there is a natural isomorphism: for each p-form
field $\omega$ on an n-dimensional manifold there is an $(n-p)$-form
field $g^{*}\omega$, called its Hodge dual. 1 The Hodge dual should not
be confused with the duality of vector bases and one-form bases,
which is defined without reference to a metric. The Hodge dual is
useful for the elegant formalization of electrodynamics.

In Euclidean 3-space, if we think of a one-form as a foliation 
of the space, then the dual is a two-form, which can be thought 
of as a pack of square tubes, whose axes are perpendicular to the 
leaves of the foliation. The original one-form divides these tubes 
up into volume elements. For example, the dual of the basis one-form
$dx$ is the two-form $g^{*}dx = dy \wedge dz$. We may think of $dx$
as a set of planes perpendicular to the x-axis. Then $g^{*}dx$ is a
set of tubes parallel to the x-axis. In higher-dimensional spaces
the visualization is more complicated, but the basic idea is the 
same. The Hodge dual of a two-form in four dimensions is a two- 
form that is perpendicular to the given two-form. However, if the 
metric is indefinite (e.g., the Lorentz metric) there is an added 
complication with the signs. 

The Hodge dual is a linear operator, so it can be defined by
its action on the basis elements. Let $\{\partial/\partial x^0, \ldots, \partial/\partial x^{n-1}\}$ be an
orthonormal basis of vector fields 2 and let $\{dx^0, \ldots, dx^{n-1}\}$ be
the ordinary dual basis for the one-forms. Then the $(n-p)$-form
$g^{*}\omega$ that is the Hodge dual of the p-form $\omega$ can be defined by its
coefficients with respect to the basis, using indices, as

$$(g^{*}\omega)_{j_p \ldots j_{n-1}} = \sum_{i_0 \ldots i_{p-1}\, j_0 \ldots j_{p-1}} \frac{1}{p!}\, \omega_{i_0 \ldots i_{p-1}}\, g^{i_0 j_0} \cdots g^{i_{p-1} j_{p-1}}\, \epsilon_{j_0 \ldots j_{n-1}} \tag{10.1}$$

where $g^{ij}$ are the coefficients of the inverse metric and $\epsilon_{j_0 \ldots j_{n-1}}$ is
either $-1$ or $+1$ if the permutation $\{0 \ldots n-1\} \mapsto \{j_0 \ldots j_{n-1}\}$ is
odd or even, respectively.

1 The traditional notion is to just use an asterisk; we use $g^{*}$ to emphasize that
this duality depends on the choice of metric g.

2 We have a metric, so we can define “orthonormal” and use it to construct
an orthonormal basis given any basis. The Gram-Schmidt procedure does the
job.
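To see the index formula at work, take Euclidean 3-space ($n = 3$, $g^{ij} = \delta^{ij}$) with coordinates $(x^0, x^1, x^2) = (x, y, z)$. The sketch below evaluates the Hodge dual of the basis one-form $dx$ ($p = 1$) and of the two-form $dy \wedge dz$ ($p = 2$). It assumes the component convention $\omega_{jk} = \omega(\partial/\partial x^j, \partial/\partial x^k)$, which is a choice made here for illustration.

```latex
\begin{gather*}
\text{For } \omega = dx:\quad \omega_0 = 1,\ \omega_1 = \omega_2 = 0, \\
(g^{*}dx)_{j_1 j_2} = \sum_{i_0, j_0} \omega_{i_0}\, \delta^{i_0 j_0}\, \epsilon_{j_0 j_1 j_2}
   = \epsilon_{0 j_1 j_2},
   \qquad (g^{*}dx)_{12} = +1,\quad (g^{*}dx)_{21} = -1, \\
\text{so}\quad g^{*}dx = dy \wedge dz. \\[1ex]
\text{For } \omega = dy \wedge dz:\quad \omega_{12} = 1,\ \omega_{21} = -1, \\
(g^{*}(dy \wedge dz))_{j_2} = \frac{1}{2!}\left(\epsilon_{1 2 j_2} - \epsilon_{2 1 j_2}\right)
   = \epsilon_{1 2 j_2},
   \qquad (g^{*}(dy \wedge dz))_{0} = +1, \\
\text{so}\quad g^{*}(dy \wedge dz) = dx.
\end{gather*}
```

The factor $1/p!$ compensates for the sum running over both orderings of the antisymmetric index pair.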


Relationship to Vector Calculus 

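In 3-dimensional Euclidean space the traditional vector derivative
operations are gradient, curl, and divergence. If $\hat{x}$, $\hat{y}$, $\hat{z}$ are the
usual orthonormal rectangular vector basis, $f$ a function on the
space, and $\vec{v}$ a vector field on the space, then

$$\mathrm{grad}(f) = \frac{\partial f}{\partial x}\hat{x} + \frac{\partial f}{\partial y}\hat{y} + \frac{\partial f}{\partial z}\hat{z},$$

$$\mathrm{curl}(\vec{v}) = \left(\frac{\partial v_z}{\partial y} - \frac{\partial v_y}{\partial z}\right)\hat{x} + \left(\frac{\partial v_x}{\partial z} - \frac{\partial v_z}{\partial x}\right)\hat{y} + \left(\frac{\partial v_y}{\partial x} - \frac{\partial v_x}{\partial y}\right)\hat{z},$$

$$\mathrm{div}(\vec{v}) = \frac{\partial v_x}{\partial x} + \frac{\partial v_y}{\partial y} + \frac{\partial v_z}{\partial z}.$$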


Recall the meaning of the traditional vector operations. Tra- 
ditionally we assume that there is a metric that allows us to de- 
termine distances between locations and angles between vectors. 
Such a metric establishes local scale factors relating coordinate
increments to actual distances. The vector gradient, $\mathrm{grad}(f)$, points
in the direction of steepest increase in the function with respect to
actual distances. By contrast, the gradient one-form, $df$, does not
depend on a metric, so there is no concept of distance built in to
it. Nevertheless, the concepts are related. The gradient one-form
is given by

$$df = \left(\frac{\partial f}{\partial x}\right) dx + \left(\frac{\partial f}{\partial y}\right) dy + \left(\frac{\partial f}{\partial z}\right) dz. \tag{10.2}$$

The traditional gradient vector field is then just the raised gradient
one-form (see equation 9.8). So

$$\mathrm{grad}(f) = g^{\sharp}(df) \tag{10.3}$$

is computed by

(define (gradient metric basis)
  (compose (raise metric basis) d))
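Let $\theta$ be a one-form field:

$$\theta = \theta_x\, dx + \theta_y\, dy + \theta_z\, dz. \tag{10.4}$$

We compute

$$d\theta = \left(\frac{\partial\theta_z}{\partial y} - \frac{\partial\theta_y}{\partial z}\right) dy \wedge dz + \left(\frac{\partial\theta_x}{\partial z} - \frac{\partial\theta_z}{\partial x}\right) dz \wedge dx + \left(\frac{\partial\theta_y}{\partial x} - \frac{\partial\theta_x}{\partial y}\right) dx \wedge dy. \tag{10.5}$$

So the exterior-derivative expression corresponding to the
vector-calculus curl is:

$$g^{*}(d\theta) = \left(\frac{\partial\theta_z}{\partial y} - \frac{\partial\theta_y}{\partial z}\right) dx + \left(\frac{\partial\theta_x}{\partial z} - \frac{\partial\theta_z}{\partial x}\right) dy + \left(\frac{\partial\theta_y}{\partial x} - \frac{\partial\theta_x}{\partial y}\right) dz. \tag{10.6}$$

Thus, the curl of a vector field $\vec{v}$ is

$$\mathrm{curl}(\vec{v}) = g^{\sharp}(g^{*}(d(g^{\flat}(\vec{v})))), \tag{10.7}$$

which can be computed with

(define (curl metric orthonormal-basis)
  (let ((star (Hodge-star metric orthonormal-basis))
        (sharp (raise metric orthonormal-basis))
        (flat (lower metric)))
    (compose sharp star d flat)))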

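Also, we compute

$$d(g^{*}\theta) = \left(\frac{\partial\theta_x}{\partial x} + \frac{\partial\theta_y}{\partial y} + \frac{\partial\theta_z}{\partial z}\right) dx \wedge dy \wedge dz. \tag{10.8}$$

So the exterior-derivative expression corresponding to the
vector-calculus div is

$$g^{*}d(g^{*}\theta) = \frac{\partial\theta_x}{\partial x} + \frac{\partial\theta_y}{\partial y} + \frac{\partial\theta_z}{\partial z}. \tag{10.9}$$

Thus, the divergence of a vector field $\vec{v}$ is

$$\mathrm{div}(\vec{v}) = g^{*}(d(g^{*}(g^{\flat}(\vec{v})))). \tag{10.10}$$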
It is easily computed:

(define (divergence metric orthonormal-basis)
  (let ((star (Hodge-star metric orthonormal-basis))
        (flat (lower metric)))
    (compose star d star flat)))

The divergence is defined even if we don’t have a metric, but
have only a connection. In that case the divergence can be computed
with

(define (((divergence Cartan) v) point)
  (let ((basis (Cartan->basis Cartan))
        (nabla (covariant-derivative Cartan)))
    (contract
     (lambda (ei wi)
       ((wi ((nabla ei) v)) point))
     basis)))

If the Cartan form is derived from a metric these programs yield
the same answer.

The Laplacian is, as expected, the composition of the divergence
and the gradient:

(define (Laplacian metric orthonormal-basis)
  (compose (divergence metric orthonormal-basis)
           (gradient metric orthonormal-basis)))
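In Euclidean 3-space with rectangular coordinates this composition can be followed by hand. The sketch below assumes the orthonormal basis $(dx, dy, dz)$, for which raising and then lowering leaves components unchanged, so the divergence of the gradient reduces to $g^{*}(d(g^{*}(df)))$. Subscripts on $f$ denote partial derivatives, a notation used only in this example.

```latex
\begin{align*}
df &= f_x\, dx + f_y\, dy + f_z\, dz, \\
g^{*}(df) &= f_x\, dy \wedge dz + f_y\, dz \wedge dx + f_z\, dx \wedge dy, \\
d(g^{*}(df)) &= (f_{xx} + f_{yy} + f_{zz})\, dx \wedge dy \wedge dz, \\
g^{*}(d(g^{*}(df))) &= f_{xx} + f_{yy} + f_{zz}.
\end{align*}
```

The last line is the familiar rectangular-coordinate Laplacian.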

Spherical Coordinates 

We can illustrate these by computing the formulas for the vector- 
calculus operators in spherical coordinates. We start with a 
3-dimensional manifold, and we set up the conditions for spherical 
coordinates. 

(define spherical R3-rect)

(define-coordinates (up r theta phi) spherical)

(define R3-spherical-point
  ((point spherical) (up 'r0 'theta0 'phi0)))

The geometry is specified by the metric:

(define (spherical-metric v1 v2)
  (+ (* (dr v1) (dr v2))
     (* (square r)
        (+ (* (dtheta v1) (dtheta v2))
           (* (expt (sin theta) 2)
              (dphi v1) (dphi v2))))))

We also need an orthonormal basis for the spherical coordinates.
The coordinate basis is orthogonal but not normalized.

(define e_0 d/dr)

(define e_1 (* (/ 1 r) d/dtheta))

(define e_2 (* (/ 1 (* r (sin theta))) d/dphi))

(define orthonormal-spherical-vector-basis
  (down e_0 e_1 e_2))

(define orthonormal-spherical-1form-basis
  (vector-basis->dual orthonormal-spherical-vector-basis
                      spherical))

(define orthonormal-spherical-basis
  (make-basis orthonormal-spherical-vector-basis
              orthonormal-spherical-1form-basis))

The components of the gradient of a scalar field are obtained
using the dual basis:

((orthonormal-spherical-1form-basis
  ((gradient spherical-metric orthonormal-spherical-basis)
   (literal-manifold-function 'f spherical)))
 R3-spherical-point)
(up (((partial 0) f) (up r0 theta0 phi0))
    (/ (((partial 1) f) (up r0 theta0 phi0))
       r0)
    (/ (((partial 2) f) (up r0 theta0 phi0))
       (* r0 (sin theta0))))

To get the formulas for curl and divergence we need a vector
field with components with respect to the normalized basis.

(define v
  (+ (* (literal-manifold-function 'v^0 spherical) e_0)
     (* (literal-manifold-function 'v^1 spherical) e_1)
     (* (literal-manifold-function 'v^2 spherical) e_2)))
The curl is a bit complicated:

((orthonormal-spherical-1form-basis
  ((curl spherical-metric orthonormal-spherical-basis) v))
 R3-spherical-point)
(up
 (/ (+ (* (sin theta0)
          (((partial 1) v^2) (up r0 theta0 phi0)))
       (* (cos theta0) (v^2 (up r0 theta0 phi0)))
       (* -1 (((partial 2) v^1) (up r0 theta0 phi0))))
    (* r0 (sin theta0)))
 (/ (+ (* -1 r0 (sin theta0)
          (((partial 0) v^2) (up r0 theta0 phi0)))
       (* -1 (sin theta0) (v^2 (up r0 theta0 phi0)))
       (((partial 2) v^0) (up r0 theta0 phi0)))
    (* r0 (sin theta0)))
 (/ (+ (* r0 (((partial 0) v^1) (up r0 theta0 phi0)))
       (v^1 (up r0 theta0 phi0))
       (* -1 (((partial 1) v^0) (up r0 theta0 phi0))))
    r0))

But the divergence and Laplacian are simpler:

(((divergence spherical-metric orthonormal-spherical-basis) v)
 R3-spherical-point)
(+ (((partial 0) v^0) (up r0 theta0 phi0))
   (/ (* 2 (v^0 (up r0 theta0 phi0))) r0)
   (/ (((partial 1) v^1) (up r0 theta0 phi0)) r0)
   (/ (* (v^1 (up r0 theta0 phi0)) (cos theta0))
      (* r0 (sin theta0)))
   (/ (((partial 2) v^2) (up r0 theta0 phi0))
      (* r0 (sin theta0))))

(((Laplacian spherical-metric orthonormal-spherical-basis)
  (literal-manifold-function 'f spherical))
 R3-spherical-point)
(+ (((partial 0) ((partial 0) f)) (up r0 theta0 phi0))
   (/ (* 2 (((partial 0) f) (up r0 theta0 phi0)))
      r0)
   (/ (((partial 1) ((partial 1) f)) (up r0 theta0 phi0))
      (expt r0 2))
   (/ (* (cos theta0) (((partial 1) f) (up r0 theta0 phi0)))
      (* (expt r0 2) (sin theta0)))
   (/ (((partial 2) ((partial 2) f)) (up r0 theta0 phi0))
      (* (expt r0 2) (expt (sin theta0) 2))))
10.1 The Wave Equation

The kinematics of special relativity can be formulated on a flat
4-dimensional spacetime manifold.

(define SR R4-rect)

(define-coordinates (up ct x y z) SR)

(define an-event ((point SR) (up 'ct0 'x0 'y0 'z0)))

(define a-vector
  (+ (* (literal-manifold-function 'v^t SR) d/dct)
     (* (literal-manifold-function 'v^x SR) d/dx)
     (* (literal-manifold-function 'v^y SR) d/dy)
     (* (literal-manifold-function 'v^z SR) d/dz)))

The Minkowski metric is 3

$$g(u, v) = -c^2\, dt(u)\, dt(v) + dx(u)\, dx(v) + dy(u)\, dy(v) + dz(u)\, dz(v). \tag{10.11}$$

As a program:

(define (g-Minkowski u v)
  (+ (* -1 (dct u) (dct v))
     (* (dx u) (dx v))
     (* (dy u) (dy v))
     (* (dz u) (dz v))))

The length of a vector is described in terms of the metric:

$$\sigma = g(v, v). \tag{10.12}$$

If $\sigma$ is positive the vector is spacelike and its square root is the
proper length of the vector. If $\sigma$ is negative the vector is timelike
and the square root of its negation is the proper time of the vector.
If $\sigma$ is zero the vector is lightlike or null.

((g-Minkowski a-vector a-vector) an-event)
(+ (* -1 (expt (v^t (up ct0 x0 y0 z0)) 2))
   (expt (v^x (up ct0 x0 y0 z0)) 2)
   (expt (v^y (up ct0 x0 y0 z0)) 2)
   (expt (v^z (up ct0 x0 y0 z0)) 2))

3 The metric in relativity is not positive definite, so nonzero vectors can have
zero length.
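A few concrete vectors make the three cases easy to tell apart. The vectors below are hypothetical examples with constant components in the $(ct, x, y, z)$ basis; they do not appear in the text's programs.

```latex
\begin{align*}
v_1 &= 2\,\frac{\partial}{\partial ct} + \frac{\partial}{\partial x}:
   & \sigma &= -(2)^2 + (1)^2 = -3 < 0 && \text{timelike},\ \sqrt{-\sigma} = \sqrt{3}, \\
v_2 &= \frac{\partial}{\partial ct} + \frac{\partial}{\partial x}:
   & \sigma &= -(1)^2 + (1)^2 = 0 && \text{lightlike (null)}, \\
v_3 &= \frac{\partial}{\partial x} + \frac{\partial}{\partial y}:
   & \sigma &= (1)^2 + (1)^2 = 2 > 0 && \text{spacelike},\ \sqrt{\sigma} = \sqrt{2}.
\end{align*}
```

The second vector is nonzero yet has zero length, which is possible only because the metric is indefinite.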
As an example of vector calculus in four dimensions, we can
compute the wave equation for a scalar field in 4-dimensional
spacetime.

We need an orthonormal basis for the spacetime:

;; Assumed definition: SR-basis is used below but not defined in
;; this section; it is taken to be the coordinate basis of SR.
(define SR-basis (coordinate-system->basis SR))

(define SR-vector-basis (coordinate-system->vector-basis SR))

We check that it is orthonormal with respect to the metric:

((g-Minkowski SR-vector-basis SR-vector-basis) an-event)
(down (down -1 0 0 0)
      (down 0 1 0 0)
      (down 0 0 1 0)
      (down 0 0 0 1))

So, the Laplacian of a scalar field is the wave equation!

(define p (literal-manifold-function 'phi SR))

(((Laplacian g-Minkowski SR-basis) p) an-event)
(+ (((partial 0) ((partial 0) phi)) (up ct0 x0 y0 z0))
   (* -1 (((partial 1) ((partial 1) phi)) (up ct0 x0 y0 z0)))
   (* -1 (((partial 2) ((partial 2) phi)) (up ct0 x0 y0 z0)))
   (* -1 (((partial 3) ((partial 3) phi)) (up ct0 x0 y0 z0))))
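One way to see that this expression is the wave operator is to apply it to a disturbance traveling in the x direction. The sketch below assumes a hypothetical twice-differentiable profile function $F$ of one variable (not part of the text's programs) and sets $\phi(ct, x, y, z) = F(x - ct)$.

```latex
\begin{gather*}
\frac{\partial^2 \phi}{\partial (ct)^2} = F''(x - ct), \qquad
\frac{\partial^2 \phi}{\partial x^2} = F''(x - ct), \qquad
\frac{\partial^2 \phi}{\partial y^2} = \frac{\partial^2 \phi}{\partial z^2} = 0, \\
\frac{\partial^2 \phi}{\partial (ct)^2} - \frac{\partial^2 \phi}{\partial x^2}
  - \frac{\partial^2 \phi}{\partial y^2} - \frac{\partial^2 \phi}{\partial z^2}
  = F''(x - ct) - F''(x - ct) = 0 .
\end{gather*}
```

Any profile that advances one unit of x per unit of ct, that is, at speed c, is therefore annihilated by the operator.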


10.2 Electrodynamics 

Using Hodge duals we can represent electrodynamics in an elegant 
way. Maxwell’s electrodynamics is invariant under Lorentz trans- 
formations. We use 4-dimensional rectangular coordinates for the 
flat spacetime of special relativity. 

In this formulation of electrodynamics the electric and magnetic
fields are represented together as a two-form field, the Faraday
tensor. Under Lorentz transformations the individual components
are mixed. The Faraday tensor is: 4

(define (Faraday Ex Ey Ez Bx By Bz)
  (+ (* Ex (wedge dx dct))
     (* Ey (wedge dy dct))
     (* Ez (wedge dz dct))
     (* Bx (wedge dy dz))
     (* By (wedge dz dx))
     (* Bz (wedge dx dy))))


4 This representation is from Misner, Thorne, and Wheeler, Gravitation, p.108. 
The Hodge dual of the Faraday tensor exchanges the electric and
magnetic fields, negating the components that will involve time.
The result is called the Maxwell tensor.

(define (Maxwell Ex Ey Ez Bx By Bz)
  (+ (* -1 Bx (wedge dx dct))
     (* -1 By (wedge dy dct))
     (* -1 Bz (wedge dz dct))
     (* Ex (wedge dy dz))
     (* Ey (wedge dz dx))
     (* Ez (wedge dx dy))))

We make a Hodge dual operator for this situation:

(define SR-star (Hodge-star g-Minkowski SR-basis))

And indeed, it transforms the Faraday tensor into the Maxwell
tensor:

(((- (SR-star (Faraday 'Ex 'Ey 'Ez 'Bx 'By 'Bz))
     (Maxwell 'Ex 'Ey 'Ez 'Bx 'By 'Bz))
  (literal-vector-field 'u SR)
  (literal-vector-field 'v SR))
 an-event)
0

One way to get electric fields is to have charges; magnetic fields
can arise from motion of charges. In this formulation we combine
the charge density and the current to make a one-form field:

(define (J charge-density Ix Iy Iz)
  (- (* (/ 1 :c) (+ (* Ix dx) (* Iy dy) (* Iz dz)))
     (* charge-density dct)))

The coefficient (/ 1 :c) makes the components of the one-form
uniform with respect to units.

To develop Maxwell’s equations we need a general Faraday field
and a general current-density field:

(define F
  (Faraday (literal-manifold-function 'Ex SR)
           (literal-manifold-function 'Ey SR)
           (literal-manifold-function 'Ez SR)
           (literal-manifold-function 'Bx SR)
           (literal-manifold-function 'By SR)
           (literal-manifold-function 'Bz SR)))

(define 4-current
  (J (literal-manifold-function 'rho SR)
     (literal-manifold-function 'Ix SR)
     (literal-manifold-function 'Iy SR)
     (literal-manifold-function 'Iz SR)))

Maxwell’s Equations 

Maxwell’s equations in the language of differential forms are

$$dF = 0, \tag{10.13}$$

$$d(g^{*}F) = 4\pi\, g^{*}J. \tag{10.14}$$

The first equation gives us what would be written in vector notation
as

$$\mathrm{div}\,\vec{B} = 0, \tag{10.15}$$

$$\mathrm{curl}\,\vec{E} = -\frac{1}{c}\frac{\partial \vec{B}}{\partial t}. \tag{10.16}$$

The second equation gives us what would be written in vector
notation as

$$\mathrm{div}\,\vec{E} = 4\pi\rho, \tag{10.17}$$

$$\mathrm{curl}\,\vec{B} = \frac{1}{c}\frac{\partial \vec{E}}{\partial t} + \frac{4\pi}{c}\vec{I}. \tag{10.18}$$

To see how these work out, we evaluate each component of
$dF$ and $d(g^{*}F) - 4\pi g^{*}J$. Since these are both two-form fields,
their exterior derivatives are three-form fields, so we have to provide
three basis vectors to get each component. Each component
equation will yield one of Maxwell’s equations, written in coordinates,
without vector notation. So, the purely spatial component
$dF(\partial/\partial x, \partial/\partial y, \partial/\partial z)$ of equation 10.13 is equation 10.15:

(((d F) d/dx d/dy d/dz) an-event)
(+ (((partial 1) Bx) (up ct0 x0 y0 z0))
   (((partial 2) By) (up ct0 x0 y0 z0))
   (((partial 3) Bz) (up ct0 x0 y0 z0)))

$$\frac{\partial B_x}{\partial x} + \frac{\partial B_y}{\partial y} + \frac{\partial B_z}{\partial z} = 0. \tag{10.19}$$
The three mixed space and time components of equation 10.13
are equation 10.16:

(((d F) d/dct d/dy d/dz) an-event)
(+ (((partial 0) Bx) (up ct0 x0 y0 z0))
   (((partial 2) Ez) (up ct0 x0 y0 z0))
   (* -1 (((partial 3) Ey) (up ct0 x0 y0 z0))))

$$\frac{\partial E_z}{\partial y} - \frac{\partial E_y}{\partial z} = -\frac{1}{c}\frac{\partial B_x}{\partial t}, \tag{10.20}$$

(((d F) d/dct d/dz d/dx) an-event)
(+ (((partial 0) By) (up ct0 x0 y0 z0))
   (((partial 3) Ex) (up ct0 x0 y0 z0))
   (* -1 (((partial 1) Ez) (up ct0 x0 y0 z0))))

$$\frac{\partial E_x}{\partial z} - \frac{\partial E_z}{\partial x} = -\frac{1}{c}\frac{\partial B_y}{\partial t}, \tag{10.21}$$

(((d F) d/dct d/dx d/dy) an-event)
(+ (((partial 0) Bz) (up ct0 x0 y0 z0))
   (((partial 1) Ey) (up ct0 x0 y0 z0))
   (* -1 (((partial 2) Ex) (up ct0 x0 y0 z0))))

$$\frac{\partial E_y}{\partial x} - \frac{\partial E_x}{\partial y} = -\frac{1}{c}\frac{\partial B_z}{\partial t}. \tag{10.22}$$


The purely spatial component of equation 10.14 is equation 10.17:

(((- (d (SR-star F)) (* 4 :pi (SR-star 4-current)))
  d/dx d/dy d/dz)
 an-event)
(+ (* -4 :pi (rho (up ct0 x0 y0 z0)))
   (((partial 1) Ex) (up ct0 x0 y0 z0))
   (((partial 2) Ey) (up ct0 x0 y0 z0))
   (((partial 3) Ez) (up ct0 x0 y0 z0)))

$$\frac{\partial E_x}{\partial x} + \frac{\partial E_y}{\partial y} + \frac{\partial E_z}{\partial z} = 4\pi\rho. \tag{10.23}$$
And finally, the three mixed time and space components of
equation 10.14 are equation 10.18:

(((- (d (SR-star F)) (* 4 :pi (SR-star 4-current)))
  d/dct d/dy d/dz)
 an-event)
(+ (((partial 0) Ex) (up ct0 x0 y0 z0))
   (* -1 (((partial 2) Bz) (up ct0 x0 y0 z0)))
   (((partial 3) By) (up ct0 x0 y0 z0))
   (/ (* 4 :pi (Ix (up ct0 x0 y0 z0))) :c))

$$\frac{\partial B_y}{\partial z} - \frac{\partial B_z}{\partial y} = -\frac{1}{c}\frac{\partial E_x}{\partial t} - \frac{4\pi}{c} I_x, \tag{10.24}$$

(((- (d (SR-star F)) (* 4 :pi (SR-star 4-current)))
  d/dct d/dz d/dx)
 an-event)
(+ (((partial 0) Ey) (up ct0 x0 y0 z0))
   (* -1 (((partial 3) Bx) (up ct0 x0 y0 z0)))
   (((partial 1) Bz) (up ct0 x0 y0 z0))
   (/ (* 4 :pi (Iy (up ct0 x0 y0 z0))) :c))

$$\frac{\partial B_z}{\partial x} - \frac{\partial B_x}{\partial z} = -\frac{1}{c}\frac{\partial E_y}{\partial t} - \frac{4\pi}{c} I_y, \tag{10.25}$$

(((- (d (SR-star F)) (* 4 :pi (SR-star 4-current)))
  d/dct d/dx d/dy)
 an-event)
(+ (((partial 0) Ez) (up ct0 x0 y0 z0))
   (* -1 (((partial 1) By) (up ct0 x0 y0 z0)))
   (((partial 2) Bx) (up ct0 x0 y0 z0))
   (/ (* 4 :pi (Iz (up ct0 x0 y0 z0))) :c))

$$\frac{\partial B_x}{\partial y} - \frac{\partial B_y}{\partial x} = -\frac{1}{c}\frac{\partial E_z}{\partial t} - \frac{4\pi}{c} I_z. \tag{10.26}$$

Lorentz Force

The classical force on a charged particle moving in an electromagnetic
field is

$$\vec{f} = q\left(\vec{E} + \frac{1}{c}\,\vec{v} \times \vec{B}\right). \tag{10.27}$$

We can compute this in coordinates. We construct arbitrary $\vec{E}$
and $\vec{B}$ vector fields and an arbitrary velocity:
(define E
  (up (literal-manifold-function 'Ex SR)
      (literal-manifold-function 'Ey SR)
      (literal-manifold-function 'Ez SR)))

(define B
  (up (literal-manifold-function 'Bx SR)
      (literal-manifold-function 'By SR)
      (literal-manifold-function 'Bz SR)))

(define V (up 'V_x 'V_y 'V_z))

The 3-space force that results is a mess:

(* 'q (+ (E an-event) (cross-product V (B an-event))))
(up (+ (* q (Ex (up ct0 x0 y0 z0)))
       (* q V_y (Bz (up ct0 x0 y0 z0)))
       (* -1 q V_z (By (up ct0 x0 y0 z0))))
    (+ (* q (Ey (up ct0 x0 y0 z0)))
       (* -1 q V_x (Bz (up ct0 x0 y0 z0)))
       (* q V_z (Bx (up ct0 x0 y0 z0))))
    (+ (* q (Ez (up ct0 x0 y0 z0)))
       (* q V_x (By (up ct0 x0 y0 z0)))
       (* -1 q V_y (Bx (up ct0 x0 y0 z0)))))

The relativistic Lorentz 4-force is usually written in coordinates
as

$$f^{\nu} = -\sum_{\alpha, \mu} q\, U^{\mu} F_{\mu\alpha}\, \eta^{\alpha\nu}, \tag{10.28}$$

where $U$ is the 4-velocity of the charged particle, $F$ is the Faraday
tensor, and $\eta^{\alpha\nu}$ are the components of the inverse of the Minkowski
metric. Here is a program that computes a component of the force
in terms of the Faraday tensor. The desired component is specified
by a one-form.

(define (Force charge F 4velocity component)
  (* -1 charge
     (contract (lambda (a b)
                 (contract (lambda (e w)
                             (* (w 4velocity)
                                (F e a)
                                (eta-inverse b component)))
                           SR-basis))
               SR-basis)))
So, for example, the force in the x direction for a stationary
particle is

((Force 'q F d/dct dx) an-event)
(* q (Ex (up ct0 x0 y0 z0)))

Notice that the 4-velocity d/dct is the 4-velocity of a stationary
particle!

If we give a particle a more general timelike 4-velocity in the
x direction we can see how the y component of the force involves
both the electric and magnetic field:

(define (Ux beta)
  (+ (* (/ 1 (sqrt (- 1 (square beta)))) d/dct)
     (* (/ beta (sqrt (- 1 (square beta)))) d/dx)))

((Force 'q F (Ux 'v/c) dy) an-event)
(/ (+ (* -1 q v/c (Bz (up ct0 x0 y0 z0)))
      (* q (Ey (up ct0 x0 y0 z0))))
   (sqrt (+ 1 (* -1 (expt v/c 2)))))

Exercise 10.1: Relativistic Lorentz Force

Compute all components of the 4-force for a general timelike 4-velocity.

a. Compare these components to the components of the nonrelativistic
force given above. Interpret the differences.

b. What is the meaning of the time component? For example, consider:

((Force 'q F (Ux 'v/c) dct) an-event)
(/ (* q v/c (Ex (up ct0 x0 y0 z0)))
   (sqrt (+ 1 (* -1 (expt v/c 2)))))

c. Subtract the structure of components of the relativistic 3-space force
from the structure of the spatial components of the 4-space force to show
that they are equal.




11 

Special Relativity 


Although the usual treatments of special relativity begin with the 
Michelson-Morley experiment, this is not how Einstein began. In 
fact, Einstein was impressed with Maxwell’s work and he was em- 
ulating Maxwell’s breakthrough. 

Maxwell was preceded by Faraday, Ampere, Oersted, Coulomb, 
Gauss, and Franklin. These giants discovered electromagnetism 
and worked out empirical equations that described the phenom- 
ena. They understood the existence of conserved charges and 
fields. Faraday invented the idea of lines of force by which fields 
can be visualized. 

Maxwell’s great insight was noticing and resolving the contra- 
diction between the empirically-derived laws of electromagnetism 
and conservation of charge. He did this by introducing the then ex- 
perimentally undetectable displacement-current term into one of 
the empirical equations. The modified equations implied a wave 
equation and the propagation speed of the wave predicted by the 
new equation turned out to be the speed of light, as measured 
by the eclipses of the Galilean satellites of Jupiter. The experi- 
mental confirmation by Hertz of the existence of electromagnetic 
radiation that obeyed Maxwell’s equations capped the discovery. 

By analogy, Einstein noticed that Maxwell’s equations were
inconsistent with Galilean relativity. In free space, where
electromagnetic waves propagate, Maxwell’s equations say that the
vector source of electric fields is the time rate of change of the
magnetic field and the vector source of magnetic field is the time rate
of change of the electric field. The combination of these ideas
yields the wave equation. The wave equation itself is not invariant
under the Galilean transformation: As Einstein noted, if you
run with the propagation speed of the wave there is no time
variation in the field you observe, so there is no space variation
either, contradicting the wave equation. But the Maxwell theory
is beautiful, and it can be verified to a high degree of accuracy,
so there must be something wrong with Galilean relativity. Einstein
resolved the contradiction by generalizing the meaning of the
Lorentz transformation, which was invented to explain the failure
of the Michelson-Morley experiment. Lorentz and his colleagues
decided that the problem with the Michelson-Morley experiment
was that matter interacting with the luminiferous ether contracts
in the direction of motion. To make this consistent Lorentz had to
invent a “local time” which had no clear interpretation. Einstein
took the Lorentz transformation to be a fundamental replacement
for the Galilean transformation in all of mechanics.

Now to the details. Before Maxwell the empirical laws of elec- 
tromagnetism were as follows. Electric fields arise from charges, 
with the inverse square law of Coulomb. This is Carl Friedrich 
Gauss’s law for electrostatics: 

\[ \operatorname{div}\vec{E} = 4\pi\rho. \tag{11.1} \]

Magnetic fields do not have a scalar source. This is Gauss’s law 
for magnetostatics: 

\[ \operatorname{div}\vec{B} = 0. \tag{11.2} \]

Magnetic fields are produced by electric currents, as discovered by
Hans Christian Oersted and quantified by André-Marie Ampère:

\[ \operatorname{curl}\vec{B} = \frac{4\pi}{c}\,\vec{I}. \tag{11.3} \]

Michael Faraday (and Joseph Henry) discovered that electric fields
are produced by moving magnetic fields:

\[ \operatorname{curl}\vec{E} = -\frac{1}{c}\frac{\partial\vec{B}}{\partial t}. \tag{11.4} \]

Benjamin Franklin was the first to understand that electrical
charges are conserved:

\[ \operatorname{div}\vec{I} + \frac{\partial\rho}{\partial t} = 0. \tag{11.5} \]

Although these equations are written in terms of the speed of 
light c, these laws were originally written in terms of electrical 
permittivity and magnetic permeability of free space, which could 
be determined by measurement of the forces for given currents 
and charges.

It is easy to see that these equations are mutually contradictory.
Indeed, if we take the divergence of equation (11.3) we get

\[ \operatorname{div}\operatorname{curl}\vec{B} = 0 = \frac{4\pi}{c}\operatorname{div}\vec{I}, \tag{11.6} \]

which contradicts conservation of charge (11.5) whenever the charge
density varies with time.

Maxwell patched this bug by adding in the displacement cur- 
rent, changing equation (11.3) to read 


\[ \operatorname{curl}\vec{B} = \frac{1}{c}\frac{\partial\vec{E}}{\partial t} + \frac{4\pi}{c}\,\vec{I}. \tag{11.7} \]
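As a quick consistency check (a sketch that uses only equations (11.1), (11.5), and (11.7) as stated above), taking the divergence of the patched equation no longer conflicts with charge conservation:

```latex
\begin{aligned}
0 = \operatorname{div}\operatorname{curl}\vec{B}
  &= \frac{1}{c}\frac{\partial}{\partial t}\operatorname{div}\vec{E}
     + \frac{4\pi}{c}\operatorname{div}\vec{I} \\
  &= \frac{4\pi}{c}\left(\frac{\partial\rho}{\partial t}
     + \operatorname{div}\vec{I}\right).
\end{aligned}
```

The last line vanishes exactly when the conservation law (11.5) holds, so the displacement current makes the set of equations mutually consistent.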


Maxwell proceeded by taking the curl of equation (11.4) to get 

\[ \operatorname{curl}\operatorname{curl}\vec{E} = -\frac{1}{c}\frac{\partial}{\partial t}\operatorname{curl}\vec{B}. \tag{11.8} \]


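Expanding the left-hand side

\[ \operatorname{grad}\operatorname{div}\vec{E} - \operatorname{Lap}\vec{E} = -\frac{1}{c}\frac{\partial}{\partial t}\operatorname{curl}\vec{B}, \tag{11.9} \]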


substituting from equations (11.7) and (11.1), and rearranging the 
terms we get the inhomogeneous wave equation: 


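\[ \operatorname{Lap}\vec{E} - \frac{1}{c^2}\frac{\partial^2\vec{E}}{\partial t^2} = 4\pi\left(\operatorname{grad}\rho + \frac{1}{c^2}\frac{\partial\vec{I}}{\partial t}\right). \tag{11.10} \]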


We see that in free space (in the absence of any charges or currents) 
we have the familiar homogeneous linear wave equation. A similar 
equation can be derived for the magnetic field. 

Lorentz, whom Einstein also greatly respected, developed a gen- 
eral formula to describe the force on a particle with charge q mov- 
ing with velocity v in an electromagnetic field: 


\[ \vec{F} = q\vec{E} + \frac{q}{c}\,\vec{v}\times\vec{B}. \tag{11.11} \]


A crucial point in Einstein’s inspiration for relativity is, quoting
Einstein (in English translation), “During that year [1895-1896]
in Aarau the question came to me: If one runs after a light wave
with light velocity, then one would encounter a time-independent
wavefield. However, something like that does not seem to exist!”¹
This was the observation of the inconsistency.

Let’s be more precise about this. Consider a plane sinusoidal
wave moving in the $x$ direction with velocity $c$ in free space
($\rho = 0$ and $\vec{I} = 0$). This is a perfectly good solution of
the wave equation. Now suppose that an observer is moving with the
wave in the $x$ direction with velocity $c$. Such an observer will see
no time variation of the field. So the wave equation reduces to
Laplace’s equation. But a sinusoidal variation in space is not a
solution of Laplace’s equation.
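To make this concrete, here is a worked version of the argument in one space dimension. It is a sketch under the assumption that the moving observer's coordinates are related to ours by the Galilean transformation $x' = x - ct$, $t' = t$, and that the wave equation is supposed to keep its form; the wavenumber $k$ is introduced only for illustration.

```latex
\begin{aligned}
\phi(t, x) &= \sin\bigl(k(x - ct)\bigr), &
\frac{\partial^2\phi}{\partial x^2}
  - \frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2}
  &= -k^2\phi + k^2\phi = 0, \\
\phi'(t', x') &= \sin(k x'), &
\frac{\partial^2\phi'}{\partial x'^2}
  - \frac{1}{c^2}\frac{\partial^2\phi'}{\partial t'^2}
  &= -k^2\sin(k x') \neq 0.
\end{aligned}
```

The frozen profile has no time dependence, so only the spatial second derivative survives, and it does not vanish.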

Einstein believed that the Maxwell-Lorentz electromagnetic
theory was fundamentally correct, though he was unhappy with an
apparent asymmetry in the formulation. Consider a system
consisting of a conductor and a magnet. If the conductor is moved and
the magnet is held stationary (a stationary magnetic field) then
the charge carriers in the conductor are subject to the Lorentz
force (11.11), causing them to move. However, if the magnet is
moved past a stationary conductor then the changing magnetic
field induces an electric field in the conductor by equation (11.4),
which causes the charge carriers in the conductor to move. The
actual current which results is identical for both explanations if
the relative velocity of the magnet and the conductor is the same.
To Einstein, there should not have been two explanations for the
same phenomenon.


Invariance of the Wave Equation 

Let $u = (t, x, y, z)$ be a tuple of time and space coordinates that
specify a point in spacetime.² If $\phi(t, x, y, z)$ is a scalar field
over time and space, the homogeneous linear wave equation is

\[ \frac{\partial^2\phi(u)}{\partial x^2} + \frac{\partial^2\phi(u)}{\partial y^2} + \frac{\partial^2\phi(u)}{\partial z^2} - \frac{1}{c^2}\frac{\partial^2\phi(u)}{\partial t^2} = 0. \tag{11.12} \]

The characteristics for this equation are the “light cones.” If
we define a function of spacetime points and increments, length,
such that for an incremental tuple in position and time
$\xi = (\Delta t, \Delta x, \Delta y, \Delta z)$ we have³

\[ \mathrm{length}_u(\xi) = \sqrt{(\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2 - (c\,\Delta t)^2}, \tag{11.13} \]

then the light cones are the hypersurfaces for which

\[ \mathrm{length}_u(\Delta t, \Delta x, \Delta y, \Delta z) = 0. \tag{11.14} \]

This “length” is called the interval.

¹ The quote is from Pais [12], p. 131.

² Points in spacetime are often called events.

What is the class of transformations of time and space coordi- 
nates that leave the Maxwell-Lorentz theory invariant? The trans- 
formations that preserve the wave equation are exactly those that 
leave its characteristics invariant. We consider a transformation 
$u = A(u')$ of time and space coordinates:

\begin{align}
t &= A^0(t', x', y', z'), \tag{11.15} \\
x &= A^1(t', x', y', z'), \tag{11.16} \\
y &= A^2(t', x', y', z'), \tag{11.17} \\
z &= A^3(t', x', y', z'). \tag{11.18}
\end{align}

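If we define a new field $\psi(t', x', y', z')$ such that
$\psi = \phi \circ A$, or

\[ \psi(u') = \phi(A(u')), \tag{11.19} \]

then $\psi$ will satisfy the wave equation

\[ \frac{\partial^2\psi(u')}{\partial x'^2} + \frac{\partial^2\psi(u')}{\partial y'^2} + \frac{\partial^2\psi(u')}{\partial z'^2} - \frac{1}{c^2}\frac{\partial^2\psi(u')}{\partial t'^2} = 0 \tag{11.20} \]

if and only if

\[ \mathrm{length}_{u'}(\xi') = \mathrm{length}_{A(u')}\bigl(DA(u')\,\xi'\bigr) = \mathrm{length}_u(\xi). \tag{11.21} \]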


But this is just a statement that the velocity of light is invariant
under change of the coordinate system. The class of transformations
that satisfy equation (11.21) are the Poincaré transformations.


³ Here the length is independent of the spacetime point specified by $u$. In
General Relativity we find that the metric, and thus the length function, needs
to vary with the point in spacetime.


11.1 Lorentz Transformations

Special relativity is usually presented in terms of global Lorentz
frames, with rectangular spatial coordinates. In this context the
Lorentz transformations (and, more generally, the Poincaré
transformations) can be characterized as the set of affine
transformations (linear transformations plus shift) of the coordinate
tuple (time and spatial rectangular coordinates) that preserve the
length of incremental spacetime intervals as measured by

\[ f(\xi) = -(\xi^0)^2 + (\xi^1)^2 + (\xi^2)^2 + (\xi^3)^2, \tag{11.22} \]

where $\xi$ is an incremental 4-tuple that could be added to the
coordinate 4-tuple $(ct, x, y, z)$.⁴ The Poincaré-Lorentz
transformations are of the form

\[ x = \Lambda x' + a, \tag{11.23} \]

where $\Lambda$ is the tuple representation of a linear transformation
and $a$ is a 4-tuple shift. Because the 4-tuple includes the time, these
transformations include transformations to a uniformly moving
frame. A transformation that does not rotate or shift, but just
introduces relative velocity, is sometimes called a boost.

In general relativity, global Lorentz frames do not exist, and so 
global affine transformations are irrelevant. In general relativity 
Lorentz invariance is a local property of incremental 4-tuples at a 
point. 

Incremental 4-tuples transform as

\[ \xi = \Lambda\,\xi'. \tag{11.24} \]

This places a constraint on the allowed $\Lambda$:

\[ f(\xi') = f(\Lambda\,\xi'), \tag{11.25} \]

for arbitrary $\xi'$.

The possible $\Lambda$ that are consistent with the preservation of the
interval can be completely specified and conveniently parameterized.


⁴ Incrementally, $\xi = \xi^0\,\partial/\partial ct + \xi^1\,\partial/\partial x + \xi^2\,\partial/\partial y + \xi^3\,\partial/\partial z$. The length of
this vector, using the Minkowski metric (see equation 10.11), is the Lorentz
interval, the right-hand side of equation (11.22).


Simple Lorentz Transformations

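Consider the linear transformation, in the first two coordinates,

\[ \begin{aligned}
\xi^0 &= p\,(\xi')^0 + q\,(\xi')^1, \\
\xi^1 &= r\,(\xi')^0 + s\,(\xi')^1.
\end{aligned} \tag{11.26} \]

The requirement to preserve the interval gives the constraints

\[ \begin{aligned}
p^2 - r^2 &= 1, \\
pq - rs &= 0, \\
q^2 - s^2 &= -1.
\end{aligned} \tag{11.27} \]

There are four parameters to determine, and only three equations,
so the solutions have a free parameter. It turns out that a good
choice is $\beta = q/p$. Solve to find

\[ p = \frac{1}{\sqrt{1 - \beta^2}} = \gamma(\beta), \tag{11.28} \]

and also $p = s$ and $q = r = \beta p$. This defines $\gamma$. Written
out, the transformation is

\[ \begin{aligned}
\xi^0 &= \gamma(\beta)\bigl((\xi')^0 + \beta\,(\xi')^1\bigr), \\
\xi^1 &= \gamma(\beta)\bigl(\beta\,(\xi')^0 + (\xi')^1\bigr).
\end{aligned} \tag{11.29} \]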

Simple physical arguments⁵ show that this mathematical result
relates the time and space coordinates for two systems in uniform
relative motion. The parameter $\beta$ is related to the relative velocity.

Consider incremental vectors as spacetime vectors relative to an
origin in a global inertial frame. So, for example, $\xi = (ct, x)$,
ignoring $y$ and $z$ for a moment. The unprimed coordinate origin $x = 0$
corresponds, in primed coordinates, to (using equations 11.29)

\[ x = 0 = \gamma(\beta)\,(x' + \beta c t'), \tag{11.30} \]

so

\[ \beta = -\frac{x'}{c t'} = -\frac{v'}{c}, \tag{11.31} \]

with the definition $v' = x'/t'$. We see that $\beta$ is minus $1/c$ times
the velocity ($v'$) of the unprimed system (which moves with its
origin) as “seen” in the primed coordinates.

⁵ See, for instance, Mermin, “Space and Time in Special Relativity.”

To check the consistency of our interpretation, we can find the
velocity of the origin of the primed system ($x' = 0$) as seen by the
unprimed system. Using both of equations (11.29), we find

\[ \beta = \frac{x}{ct} = \frac{v}{c}. \tag{11.32} \]

So $v' = -v$.

A consistent interpretation is that the origin of the primed system
moves with velocity $v = \beta c$ along the $x$-axis of the unprimed
system. And the unprimed system moves with the same velocity
in the other direction, when viewed in terms of the primed system.

What happened to the other coordinates: y and z? We did 
not need them to find this one-parameter family of Lorentz trans- 
formations. They are left alone. This mathematical result has a 
physical interpretation: Lengths are not affected by perpendicular 
boosts. Think about two observers on a collision course, each car- 
rying a meter stick perpendicular to their relative velocity. At the 
moment of impact, the meter sticks must coincide. The symmetry 
of the situation does not permit one observer to conclude that one 
meter stick is shorter than the other, because the other observer 
must come to the same conclusion. Both observers can put their 
conclusions to the test upon impact. 

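We can fill in the components of this simple boost:

\[ \begin{aligned}
\xi^0 &= \gamma(\beta)\bigl((\xi')^0 + \beta\,(\xi')^1\bigr), \\
\xi^1 &= \gamma(\beta)\bigl(\beta\,(\xi')^0 + (\xi')^1\bigr), \\
\xi^2 &= (\xi')^2, \\
\xi^3 &= (\xi')^3.
\end{aligned} \tag{11.33} \]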

More General Lorentz Transformations 

One direction was special in our consideration of simple boosts. 
We can make use of this fact to find boosts in any direction. 

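Let $c\boldsymbol{\beta} = (v^0, v^1, v^2)$ be the tuple of components of
the relative velocity of the origin of the primed system in the
unprimed system. The components are with respect to the same
rectangular basis used to define the spatial components of any
incremental vector.

An incremental vector can be decomposed into vectors parallel
and perpendicular to the velocity. Let $\vec{\xi}$ be the tuple of spatial
components of $\xi$, and $\xi^0$ be the time component. Then,

\[ \vec{\xi} = \vec{\xi}^{\perp} + \vec{\xi}^{\parallel}, \tag{11.34} \]

where $\boldsymbol{\beta}\cdot\vec{\xi}^{\perp} = 0$. (This is the ordinary dot
product in three dimensions.) Explicitly,

\[ \vec{\xi}^{\parallel} = \frac{\boldsymbol{\beta}}{\beta}\left(\frac{\boldsymbol{\beta}}{\beta}\cdot\vec{\xi}\right), \tag{11.35} \]

where $\beta = \|\boldsymbol{\beta}\|$, the magnitude of $\boldsymbol{\beta}$, and

\[ \vec{\xi}^{\perp} = \vec{\xi} - \vec{\xi}^{\parallel}. \tag{11.36} \]

In the simple boost of equation (11.33) we can identify $\xi^1$ with
the magnitude $|\vec{\xi}^{\parallel}|$ of the parallel component. The
perpendicular component is unchanged:

\[ \begin{aligned}
|\vec{\xi}^{\parallel}| &= \gamma(\beta)\bigl(\beta\,(\xi')^0 + |(\vec{\xi}')^{\parallel}|\bigr), \\
\vec{\xi}^{\perp} &= (\vec{\xi}')^{\perp}.
\end{aligned} \tag{11.37} \]

Putting the components back together, this leads to

\[ \begin{aligned}
\xi^0 &= \gamma(\beta)\bigl((\xi')^0 + \boldsymbol{\beta}\cdot\vec{\xi}'\bigr), \\
\vec{\xi} &= \gamma(\beta)\,\boldsymbol{\beta}\,(\xi')^0 + \vec{\xi}' + \frac{\gamma(\beta) - 1}{\beta^2}\,\boldsymbol{\beta}\,(\boldsymbol{\beta}\cdot\vec{\xi}'),
\end{aligned} \tag{11.38} \]

which gives the components of the general boost $B$ along velocity
$c\boldsymbol{\beta}$:

\[ \xi = B(\boldsymbol{\beta})(\xi'). \tag{11.39} \]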

Implementation 

We represent a 4-tuple as a flat up-tuple of components. 

(define (make-4tuple ct space)
  (up ct (ref space 0) (ref space 1) (ref space 2)))

(define (4tuple->ct v) (ref v 0))

(define (4tuple->space v)
  (up (ref v 1) (ref v 2) (ref v 3)))

The invariant interval is then 

(define (proper-space-interval 4tuple)
  (sqrt (- (square (4tuple->space 4tuple))
           (square (4tuple->ct 4tuple)))))

This is a real number for space-like intervals. A space-like interval 
is one where spatial distance is larger than can be traversed by 
light in the time interval. 

It is often convenient for the interval to be real for time-like 
intervals, where light can traverse the spatial distance in less than 
the time interval. 

(define (proper-time-interval 4tuple)
  (sqrt (- (square (4tuple->ct 4tuple))
           (square (4tuple->space 4tuple)))))
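For readers without Scmutils at hand, the distinction between the two kinds of interval can be sketched in plain Scheme. This is only an illustration: it assumes a 4-tuple is represented as an ordinary list (ct x y z), and the names sum-of-squares, interval-squared, interval-kind, and proper-time are hypothetical, not part of the library.

```scheme
;; Plain-Scheme sketch; a 4-tuple is assumed to be a list (ct x y z).
(define (sum-of-squares xs)
  (apply + (map (lambda (x) (* x x)) xs)))

;; Squared interval: spatial part minus time part.
(define (interval-squared t4)
  (- (sum-of-squares (cdr t4))
     (* (car t4) (car t4))))

;; Classify an increment by the sign of its squared interval.
(define (interval-kind t4)
  (let ((s2 (interval-squared t4)))
    (cond ((> s2 0) 'space-like)
          ((< s2 0) 'time-like)
          (else 'light-like))))

;; Real only when the increment is time-like or light-like.
(define (proper-time t4)
  (sqrt (- (interval-squared t4))))

(interval-kind '(25 24 0 0))   ; time-like
(interval-kind '(1 2 0 0))     ; space-like
(interval-kind '(5 3 4 0))     ; light-like
(proper-time '(25 24 0 0))     ; numerically equal to 7
```

Classifying by the sign of the squared interval avoids taking the square root of a negative number until we know which of the two procedures is appropriate.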

The general boost B is 

(define ((general-boost beta) xi-p)
  (let ((gamma (expt (- 1 (square beta)) -1/2)))
    (let ((factor (/ (- gamma 1) (square beta))))
      (let ((xi-p-time (4tuple->ct xi-p))
            (xi-p-space (4tuple->space xi-p)))
        (let ((beta-dot-xi-p (dot-product beta xi-p-space)))
          (make-4tuple
           (* gamma (+ xi-p-time beta-dot-xi-p))
           (+ (* gamma beta xi-p-time)
              xi-p-space
              (* factor beta beta-dot-xi-p))))))))

We can check that the interval is invariant: 

(- (proper-space-interval
    ((general-boost (up 'vx 'vy 'vz))
     (make-4tuple 'ct (up 'x 'y 'z))))
   (proper-space-interval
    (make-4tuple 'ct (up 'x 'y 'z))))
0


It is inconvenient that the general boost as just defined does not
work if $\boldsymbol{\beta}$ is zero. An alternative way to specify a
boost is through the magnitude of v/c and a direction:

(define ((general-boost2 direction v/c) 4tuple-prime)
  (let ((delta-ct-prime (4tuple->ct 4tuple-prime))
        (delta-x-prime (4tuple->space 4tuple-prime)))
    (let ((betasq (square v/c)))
      (let ((bx (dot-product direction delta-x-prime))
            (gamma (/ 1 (sqrt (- 1 betasq)))))
        (let ((alpha (- gamma 1)))
          (let ((delta-ct
                 (* gamma (+ delta-ct-prime (* bx v/c))))
                (delta-x
                 (+ (* gamma v/c direction delta-ct-prime)
                    delta-x-prime
                    (* alpha direction bx))))
            (make-4tuple delta-ct delta-x)))))))

This is well behaved as v/c goes to zero. 
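The behavior at zero velocity can be seen in a small stand-alone sketch. It assumes plain Scheme without Scmutils, with a 4-tuple represented as a list (ct x y z) and the direction as a list of three components; the names dot, scale, boost2-sketch, and interval-sq are hypothetical stand-ins for the procedures in the text.

```scheme
;; Plain-Scheme sketch of the direction-and-magnitude boost.
(define (dot a b) (apply + (map * a b)))
(define (scale k v) (map (lambda (x) (* k x)) v))

(define (boost2-sketch direction v/c)
  (lambda (tuple)
    (let* ((ct-p (car tuple))
           (x-p (cdr tuple))
           (g (/ 1 (sqrt (- 1 (* v/c v/c)))))   ; gamma
           (alpha (- g 1))
           (bx (dot direction x-p)))
      (cons (* g (+ ct-p (* bx v/c)))
            (map +
                 (scale (* g v/c ct-p) direction)
                 x-p
                 (scale (* alpha bx) direction))))))

(define (interval-sq t4)
  (- (dot (cdr t4) (cdr t4))
     (* (car t4) (car t4))))

;; With v/c = 0 we get gamma = 1 and alpha = 0: the tuple is unchanged.
((boost2-sketch '(0 0 1) 0) '(5 1 2 3))

;; A boost with v/c = 3/5 along x preserves the interval of (2 1 0 0).
(interval-sq '(2 1 0 0))                                  ; -3
(interval-sq ((boost2-sketch '(1 0 0) 3/5) '(2 1 0 0)))   ; numerically -3
```

No division by the magnitude of the velocity appears anywhere, which is why the zero-velocity case causes no trouble here.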

Rotations 

A linear transformation that does not change the magnitude of
the spatial and time components, individually, leaves the interval
invariant. So a transformation that rotates the spatial coordinates
and leaves the time component unchanged is also a Lorentz
transformation. Let $R$ be a 3-dimensional rotation. Then the extension
to a Lorentz transformation $\mathcal{R}(R)$ is defined by

\[ (\xi^0, \vec{\xi}) = (\mathcal{R}(R))\bigl((\xi')^0, \vec{\xi}'\bigr) = \bigl((\xi')^0, R(\vec{\xi}')\bigr). \tag{11.40} \]

Examining the expression for the general boost, equation (11.38),
we see that the boost transforms simply as the arguments are
rotated. Indeed,

\[ B(\boldsymbol{\beta}) = (\mathcal{R}(R))^{-1} \circ B(R(\boldsymbol{\beta})) \circ \mathcal{R}(R). \tag{11.41} \]

Note that $(\mathcal{R}(R))^{-1} = \mathcal{R}(R^{-1})$. The functional inverse
of the extended rotation is the extension of the inverse rotation. We
could use this property of boosts to think of the general boost as
a combination of a rotation and a simple boost along some special
direction.

The extended rotation can be implemented: 

(define ((extended-rotation R) xi)
  (make-4tuple
   (4tuple->ct xi)
   (R (4tuple->space xi))))

In terms of this we can check the relation between boosts and
rotations:

(let ((beta (up 'bx 'by 'bz))
      (xi (make-4tuple 'ct (up 'x 'y 'z)))
      (R (compose
          (rotate-x 'theta)
          (rotate-y 'phi)
          (rotate-z 'psi)))
      (R-inverse (compose
                  (rotate-z (- 'psi))
                  (rotate-y (- 'phi))
                  (rotate-x (- 'theta)))))
  (- ((general-boost beta) xi)
     ((compose (extended-rotation R-inverse)
               (general-boost (R beta))
               (extended-rotation R))
      xi)))
(up 0 0 0 0)

General Lorentz Transformations 

A Lorentz transformation carries an incremental 4-tuple to an- 
other 4-tuple. A general linear transformation on 4-tuples has 
sixteen free parameters. The interval is a symmetric quadratic 
form, so the requirement that the interval be preserved places only 
ten constraints on these parameters. Evidently there are six free 
parameters to the general Lorentz transformation. We already 
have three parameters that specify boosts (the three components 
of the boost velocity). And we have three more parameters in the 
extended rotations. The general Lorentz transformation can be 
constructed by combining generalized rotations and boosts. 

Any Lorentz transformation has a unique decomposition as a
generalized rotation followed by a general boost. Any $\Lambda$ that
preserves the interval can be written uniquely:

\[ \Lambda = B(\boldsymbol{\beta})\,\mathcal{R}. \tag{11.42} \]

We can use property (11.41) to see this. Suppose we follow a
general boost by a rotation. A new boost can be defined to absorb
this rotation, but only if the boost is preceded by a suitable
rotation:

\[ \mathcal{R}(R) \circ B(\boldsymbol{\beta}) = B(R(\boldsymbol{\beta})) \circ \mathcal{R}(R). \tag{11.43} \]


Exercise 11.1: Lorentz Decomposition

The counting of free parameters supports the conclusion that the gen- 
eral Lorentz transformation can be constructed by combining general- 
ized rotations and boosts. Then the decomposition (11.42) follows from 
property (11.41). Find a more convincing proof. 


11.2 Special Relativity Frames 

A new frame is defined by a Poincaré transformation from a given
frame (see equation 11.23). The transformation is specified by
a boost magnitude and a unit-vector boost direction, relative to
the given frame, and the position of the origin of the frame being
defined in the given frame.

Points in spacetime are called events. It must be possible to 
compare two events to determine if they are the same. This is 
accomplished in any particular experiment by building all frames 
involved in that experiment from a base frame, and representing 
the events as coordinates in that base frame. 

When one frame is built upon another, to determine the event 
from frame-specific coordinates or to determine the frame-specific 
coordinates for an event requires composition of the boosts that 
relate the frames to each other. The two procedures that are
required to implement this strategy are⁶

(define ((coordinates->event ancestor-frame this-frame
                             boost-direction v/c origin)
         coords)
  ((point ancestor-frame)
   (make-SR-coordinates ancestor-frame
     (+ ((general-boost2 boost-direction v/c) coords)
        origin))))

(define ((event->coordinates ancestor-frame this-frame
                             boost-direction v/c origin)
         event)
  (make-SR-coordinates this-frame
    ((general-boost2 (- boost-direction) v/c)
     (- ((chart ancestor-frame) event) origin))))


⁶ The procedure make-SR-coordinates labels the given coordinates with the
given frame. The procedures that manipulate coordinates, such as (point
ancestor-frame), check that the coordinates they are given are in the
appropriate frame. This error checking makes it easier to debug relativity
procedures.

With these two procedures, the procedure make-SR-frame constructs
a new relativistic frame by a Poincaré transformation from
a given frame.

(define make-SR-frame
  (frame-maker coordinates->event event->coordinates))

Velocity Addition Formula 

For example, we can derive the traditional velocity addition
formula. Assume that we have a base frame called home. We can
make a frame A by a boost from home in the x direction, with
components (1, 0, 0), and with a dimensionless measure of the speed
$v_a/c$. We also specify that the 4-tuple origin of this new frame
coincides with the origin of home.

(define A
  (make-SR-frame 'A home
                 (up 1 0 0)
                 'va/c
                 (make-SR-coordinates home (up 0 0 0 0))))

Frame B is built on frame A similarly, boosted by $v_b/c$.

(define B
  (make-SR-frame 'B A
                 (up 1 0 0)
                 'vb/c
                 (make-SR-coordinates A (up 0 0 0 0))))

So any point at rest in frame B will have a speed relative 
to home. For the spatial origin of frame B, with B coordinates 
(up ’ct 0 0 0), we have 

(let ((B-origin-home-coords
       ((chart home)
        ((point B)
         (make-SR-coordinates B (up 'ct 0 0 0))))))
  (/ (ref B-origin-home-coords 1)
     (ref B-origin-home-coords 0)))
(/ (+ va/c vb/c) (+ 1 (* va/c vb/c)))

obtaining the traditional velocity-addition formula. (Note that 
the resulting velocity is represented as a fraction of the speed of 
light.) This is a useful result, so:

(define (add-v/cs va/c vb/c)
  (/ (+ va/c vb/c)
     (+ 1 (* va/c vb/c))))
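A few sample evaluations show how the formula behaves. This sketch repeats the definition so that it runs on its own, and it assumes a Scheme with exact rational arithmetic (such as MIT/GNU Scheme); the particular input speeds are chosen only for illustration.

```scheme
;; Velocity addition, with speeds expressed as fractions of c.
(define (add-v/cs va/c vb/c)
  (/ (+ va/c vb/c)
     (+ 1 (* va/c vb/c))))

(add-v/cs 1/2 1/2)       ; 4/5, not 1
(add-v/cs 24/25 24/25)   ; 1200/1201, still less than 1
(add-v/cs 1 1/2)         ; 1: light speed composed with anything is light speed
```

Composing two speeds below that of light never reaches the speed of light, and the speed of light itself is a fixed point of the formula.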


11.3 Twin Paradox 

Special relativity engenders a traditional conundrum: consider 
two twins, one of whom travels and the other stays at home. When 
the traveller returns it is discovered that the traveller has aged less 
than the twin who stayed at home. How is this possible? 

The experiment begins at the start event, which we arbitrarily 
place at the origin of the home frame. 

(define start-event
  ((point home)
   (make-SR-coordinates home (up 0 0 0 0))))

There is a homebody and a traveller. The traveller leaves home 
at the start event and proceeds at 24/25 of the speed of light in 
the x direction. We define a frame for the traveller, by boosting 
from the home frame. 

(define outgoing
  (make-SR-frame 'outgoing      ; for debugging
                 home           ; base frame
                 (up 1 0 0)     ; x direction
                 24/25          ; velocity as fraction of c
                 ((chart home)
                  start-event)))

After 25 years of home time the traveller is 24 light-years out.
We define that event using the coordinates in the home frame.
Here we scale the time coordinate by the speed of light so that
the units of the ct slot in the 4-vector are the same as the units in
the spatial slots. Since v/c = 24/25 we must multiply that by the
speed of light to get the velocity. This is multiplied by 25 years
to get the x coordinate of the traveller in the home frame at the
turning point.

(define traveller-at-turning-point-event
  ((point home)
   (make-SR-coordinates home
     (up (* :c 25) (* 25 24/25 :c) 0 0))))

Note that the first component of the coordinates of an event is
the speed of light multiplied by time. The other components are 
distances. For example, the second component (the x component) 
is the distance travelled in 25 years at 24/25 the speed of light. 
This is 24 light-years. 

If we examine the displacement of the traveller in his own frame 
we see that the traveller has aged 7 years and he has not moved 
from his spatial origin. 

(- ((chart outgoing) traveller-at-turning-point-event)
   ((chart outgoing) start-event))
(up (* 7 :c) 0 0 0)

But in the frame of the homebody we see that the time has ad- 
vanced by 25 years. 

(- ((chart home) traveller-at-turning-point-event)
   ((chart home) start-event))
(up (* 25 :c) (* 24 :c) 0 0)


The proper time interval is 7 years, as seen in any frame, because 
it measures the aging of the traveller: 

(proper-time-interval
 (- ((chart outgoing) traveller-at-turning-point-event)
    ((chart outgoing) start-event)))
(* 7 :c)

(proper-time-interval
 (- ((chart home) traveller-at-turning-point-event)
    ((chart home) start-event)))
(* 7 :c)

When the traveller is at the turning point, the event of the 
homebody is: 

(define halfway-at-home-event
  ((point home)
   (make-SR-coordinates home (up (* :c 25) 0 0 0))))

and the homebody has aged

(proper-time-interval
 (- ((chart home) halfway-at-home-event)
    ((chart home) start-event)))
(* 25 :c)

(proper-time-interval
 (- ((chart outgoing) halfway-at-home-event)
    ((chart outgoing) start-event)))
(* 25 :c)

as seen from either frame.

As seen by the traveller, home is moving in the $-x$ direction at
24/25 of the velocity of light. At the turning point (7 years by his 
time) home is at: 

(define home-at-outgoing-turning-point-event
  ((point outgoing)
   (make-SR-coordinates outgoing
     (up (* 7 :c) (* 7 -24/25 :c) 0 0))))

Since home is speeding away from the traveller, the twin at 
home has aged less than the traveller. This may seem weird, but 
it is OK because this event is different from the halfway event in 
the home frame. 

(proper-time-interval
 (- ((chart home) home-at-outgoing-turning-point-event)
    ((chart home) start-event)))
(* 49/25 :c)
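These values can be checked by hand. The following worked computation is a sketch in units where times are in years and distances in light-years (so that $c = 1$), using only the numbers given above; the labels on $\tau$ are introduced here for illustration.

```latex
\begin{aligned}
\gamma &= \frac{1}{\sqrt{1 - (24/25)^2}} = \frac{1}{\sqrt{49/625}} = \frac{25}{7}, \\
\tau_{\text{traveller}} &= \sqrt{25^2 - 24^2} = \sqrt{49} = 7, \\
\tau_{\text{home, outgoing view}} &= \sqrt{7^2 - \left(7\cdot\tfrac{24}{25}\right)^2}
  = 7\sqrt{\tfrac{49}{625}} = \frac{49}{25}.
\end{aligned}
```

Each elapsed time is the other observer's coordinate time divided by $\gamma$: $25/\gamma = 7$ and $7/\gamma = 49/25$.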

The traveller turns around abruptly at this point (painful!) and 
begins the return trip. The incoming trip is the reverse of the 
outgoing trip, with origin at the turning-point event: 

(define incoming
  (make-SR-frame 'incoming home
                 (up -1 0 0) 24/25
                 ((chart home)
                  traveller-at-turning-point-event)))

After 50 years of home time the traveller reunites with the 
homebody: 

(define end-event
  ((point home)
   (make-SR-coordinates home (up (* :c 50) 0 0 0))))

Indeed, the traveller comes home after 7 more years in the
incoming frame:

(- ((chart incoming) end-event)
   (make-SR-coordinates incoming
     (up (* :c 7) 0 0 0)))
(up 0 0 0 0)

(- ((chart home) end-event)
   ((chart home)
    ((point incoming)
     (make-SR-coordinates incoming
       (up (* :c 7) 0 0 0)))))
(up 0 0 0 0)

The traveller ages only 7 years on the return segment, so his 
total aging is 14 years: 

(+ (proper-time-interval
    (- ((chart outgoing) traveller-at-turning-point-event)
       ((chart outgoing) start-event)))
   (proper-time-interval
    (- ((chart incoming) end-event)
       ((chart incoming) traveller-at-turning-point-event))))
(* 14 :c)

But the homebody ages 50 years: 

(proper-time-interval
 (- ((chart home) end-event)
    ((chart home) start-event)))
(* 50 :c)

At the turning point of the traveller the homebody is at 

(define home-at-incoming-turning-point-event
  ((point incoming)
   (make-SR-coordinates incoming
     (up 0 (* 7 -24/25 :c) 0 0))))

The time elapsed for the homebody between the reunion and 
the turning point of the homebody, as viewed by the incoming 
traveller, is about 2 years. 

(proper-time-interval
 (- ((chart home) end-event)
    ((chart home) home-at-incoming-turning-point-event)))
(* 49/25 :c)

Thus the aging of the homebody occurs at the turnaround, from 
the point of view of the traveller. 




Appendix A: Scheme


Programming languages should be designed not by 
piling feature on top of feature, but by removing 
the weaknesses and restrictions that make 
additional features appear necessary. Scheme 
demonstrates that a very small number of rules for 
forming expressions, with no restrictions on how 
they are composed, suffice to form a practical and 
efficient programming language that is flexible 
enough to support most of the major programming 
paradigms in use today. 

IEEE Standard for the Scheme Programming 
Language [10], p. 3 


Here we give an elementary introduction to Scheme.¹ For a more
precise explanation of the language see the IEEE standard [10];
for a longer introduction see the textbook [1].

Scheme is a simple programming language based on expressions. 
An expression names a value. For example, the numeral 3.14 
names an approximation to a familiar number. There are primitive 
expressions, such as a numeral, that we directly recognize, and 
there are compound expressions of several kinds. 

Procedure Calls 

A procedure call is a kind of compound expression. A procedure 
call is a sequence of expressions delimited by parentheses. The 
first subexpression in a procedure call is taken to name a proce- 
dure, and the rest of the subexpressions are taken to name the 
arguments to that procedure. The value produced by the proce- 
dure when applied to the given arguments is the value named by 
the procedure call. For example, 


(+ 1 2.14)
3.14

(+ 1 (* 2 1.07))
3.14

are both compound expressions that name the same number as
the numeral 3.14.² In these cases the symbols + and * name
procedures that add and multiply, respectively. If we replace any
subexpression of any expression with an expression that names
the same thing as the original subexpression, the thing named by
the overall expression remains unchanged. In general, a procedure
call is written

(operator operand-1 ... operand-n)

where operator names a procedure and operand-i names the ith
argument.³

¹ Many of the statements here are valid only assuming that no assignments are
used.

Lambda Expressions 

Just as we use numerals to name numbers, we use λ-expressions
to name procedures.⁴ For example, the procedure that squares its
input can be written:

(lambda (x) (* x x))

This expression can be read: “The procedure of one argument, x, 
that multiplies x by x.” Of course, we can use this expression in 
any context where a procedure is needed. For example, 

((lambda (x) (* x x)) 4)
16

The general form of a λ-expression is

(lambda formal-parameters body)

where formal-parameters is a list of symbols that will be the names
of the arguments to the procedure and body is an expression that
may refer to the formal parameters. The value of a procedure
call is the value of the body of the procedure with the arguments
substituted for the formal parameters.

² In examples we show the value that would be printed by the Scheme system
using slanted characters following the input expression.

³ In Scheme every parenthesis is essential: you cannot add extra parentheses
or remove any.

⁴ The logician Alonzo Church [5] invented λ-notation to allow the specification
of an anonymous function of a named parameter: λx[expression in x]. This
is read, “That function of one argument that is obtained by substituting the
argument for x in the indicated expression.”

Definitions 

We can use the define construct to give a name to any object.
For example, if we make the definitions⁵

(define pi 3.141592653589793)

(define square (lambda (x) (* x x)))

we can then use the symbols pi and square wherever the numeral
or the λ-expression could appear. For example, the area of the
surface of a sphere of radius 5 meters is

(* 4 pi (square 5)) 

314.1592653589793 

Procedure definitions may be expressed more conveniently using 
“syntactic sugar.” The squaring procedure may be defined 

(define (square x) (* x x))

which we may read: “To square x multiply x by x.”

In Scheme, procedures may be passed as arguments and returned
as values. For example, it is possible to make a procedure
that implements the mathematical notion of the composition of
two functions:⁶

(define compose
  (lambda (f g)
    (lambda (x)
      (f (g x)))))

((compose square sin) 2)
.826821810431806

(square (sin 2))
.826821810431806

⁵ The definition of square given here is not the definition of square in the
Scmutils system. In Scmutils, square is extended for tuples to mean the sum
of the squares of the components of the tuple. However, for arguments that
are not tuples the Scmutils square does multiply the argument by itself.

⁶ The examples are indented to help with readability. Scheme does not care
about extra white space, so we may add as much as we please to make things
easier to read.

Using the syntactic sugar shown above, we can write the defini- 
tion more conveniently. The following are both equivalent to the 
definition above: 

(define (compose f g)
  (lambda (x)
    (f (g x))))

(define ((compose f g) x)
  (f (g x)))

Conditionals 

Conditional expressions may be used to choose among several ex- 
pressions to produce a value. For example, a procedure that im- 
plements the absolute value function may be written: 

(define (abs x)
  (cond ((< x 0) (- x))
        ((= x 0) x)
        ((> x 0) x)))

The conditional cond takes a number of clauses. Each clause has 
a predicate expression, which may be either true or false, and a 
consequent expression. The value of the cond expression is the 
value of the consequent expression of the first clause for which the 
corresponding predicate expression is true. The general form of a 
conditional expression is 

(cond (predicate-1 consequent-1)
      ...
      (predicate-n consequent-n))

For convenience there is a special predicate expression else that 
can be used as the predicate in the last clause of a cond. The if 
construct provides another way to make a conditional when there
is only a binary choice to be made. For example, because we have 
to do something special only when the argument is negative, we 
could have defined abs as: 

(define (abs x)
  (if (< x 0)
      (- x)
      x))

The general form of an if expression is 
(if predicate consequent alternative)

If the predicate is true the value of the if expression is the value
of the consequent; otherwise it is the value of the alternative.
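
As a small illustration of the else clause, here is a sketch of a three-way sign procedure in plain Scheme. The name sign is an assumption made for this example; it is not defined in the text.

```scheme
;; sign is a hypothetical example procedure.
;; The else clause catches every case not handled by an earlier clause.
(define (sign x)
  (cond ((< x 0) -1)
        ((= x 0) 0)
        (else 1)))

(sign -7)   ; -1
(sign 0)    ; 0
(sign 12)   ; 1
```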

Recursive Procedures 

Given conditionals and definitions, we can write recursive proce- 
dures. For example, to compute the nth factorial number we may 
write: 

(define (factorial n)
  (if (= n 0)
      1
      (* n (factorial (- n 1)))))

(factorial 6) 

720 

(factorial 40) 

815915283247897734345611269596115894272000000000 

Local Names 

The let expression is used to give names to objects in a local 
context. For example, 

(define (f radius)
  (let ((area (* 4 pi (square radius)))
        (volume (* 4/3 pi (cube radius))))
    (/ volume area)))

(f 3) 

1 
The general form of a let expression is

(let ((variable-1 expression-1)
      ...
      (variable-n expression-n))
  body)

The value of the let expression is the value of the body expression 
in the context where the variables variable-i have the values of 
the expressions expression-i. The expressions expression-i may 
not refer to any of the variables. 

A slight variant of the let expression provides a convenient 
way to express looping constructs. We can write a procedure that 
implements an alternative algorithm for computing factorials as 
follows: 

(define (factorial n)
  (let factlp ((count 1) (answer 1))
    (if (> count n)
        answer
        (factlp (+ count 1) (* count answer)))))

(factorial 6) 

720 

Here, the symbol factlp following the let is locally defined to be 
a procedure that has the variables count and answer as its formal 
parameters. It is called the first time with the expressions 1 and 1, 
initializing the loop. Whenever the procedure named factlp is 
called later, these variables get new values that are the values of 
the operand expressions (+ count 1) and (* count answer). 
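
The same named-let pattern carries over to other accumulations. The sketch below, in plain Scheme, sums the squares of the integers from 1 to n; the names sum-of-squares, loop, k, and total are illustrative choices, not taken from the text.

```scheme
;; sum-of-squares is a hypothetical example procedure.
;; loop plays the role that factlp plays in the factorial example:
;; k counts upward and total accumulates the running sum.
(define (sum-of-squares n)
  (let loop ((k 1) (total 0))
    (if (> k n)
        total
        (loop (+ k 1) (+ total (* k k))))))

(sum-of-squares 4)   ; 30 = 1 + 4 + 9 + 16
```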

Compound Data — Lists and Vectors 

Data can be glued together to form compound data structures. A 
list is a data structure in which the elements are linked sequen- 
tially. A Scheme vector is a data structure in which the elements 
are packed in a linear array. New elements can be added to lists, 
but to access the nth element of a list takes computing time pro- 
portional to n. By contrast a Scheme vector is of fixed length, and 
its elements can be accessed in constant time. All data structures 
in this book are implemented as combinations of lists and Scheme 
vectors. Compound data objects are constructed from compo- 
nents by procedures called constructors and the components are 
accessed by selectors. 
The procedure list is the constructor for lists. The selector
list-ref gets an element of the list. All selectors in Scheme are 
zero-based. For example, 

(define a-list (list 6 946 8 356 12 620)) 

a-list 

(6 946 8 356 12 620)

(list-ref a-list 3) 

356 

(list-ref a-list 0) 

6 


Lists are built from pairs. A pair is made using the constructor 
cons. The selectors for the two components of the pair are car and 
cdr (pronounced “could-er”). 7 A list is a chain of pairs, such that
the car of each pair is the list element and the cdr of each pair is 
the next pair, except for the last cdr, which is a distinguishable 
value called the empty list and is written (). Thus, 

(car a-list) 

6 

(cdr a-list) 

(946 8 356 12 620)

(car (cdr a-list)) 

946 

(define another-list 
(cons 32 (cdr a-list))) 

another-list 

(32 946 8 356 12 620)

(car (cdr another-list)) 

946 

Both a-list and another-list share the same tail (their cdr). 
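
The sharing can be observed directly. The following plain-Scheme sketch uses the standard predicates eq? (object identity, discussed under Symbols) and equal? (structural equality); the name hand-built is an assumption made for this example.

```scheme
(define a-list (list 6 946 8 356 12 620))
(define another-list (cons 32 (cdr a-list)))

;; The two tails are the very same chain of pairs, not copies.
(eq? (cdr a-list) (cdr another-list))    ; #t

;; The lists as a whole differ in their first element.
(equal? a-list another-list)             ; #f

;; A list built by hand from pairs, ending in the empty list.
;; hand-built is a hypothetical name.
(define hand-built (cons 1 (cons 2 (cons 3 '()))))
(equal? hand-built (list 1 2 3))         ; #t
(null? (cdr (cdr (cdr hand-built))))     ; #t
```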


7 These names are accidents of history. They stand for “Contents of the Ad- 
dress part of Register” and “Contents of the Decrement part of Register” of 
the IBM 704 computer, which was used for the first implementation of Lisp 
in the late 1950s. Scheme is a dialect of Lisp. 
There is a predicate pair? that is true of pairs and false on all
other types of data. 

Vectors are simpler than lists. There is a constructor vector 
that can be used to make vectors and a selector vector-ref for 
accessing the elements of a vector: 

(define a-vector
  (vector 37 63 49 21 88 56))

a-vector

#(37 63 49 21 88 56)

(vector-ref a-vector 3) 

21 

(vector-ref a-vector 0) 

37 

Notice that a vector is distinguished from a list on printout by the
character # appearing before the initial parenthesis.

There is a predicate vector? that is true of vectors and false 
for all other types of data. 

The elements of lists and vectors may be any kind of data, 
including numbers, procedures, lists, and vectors. Numerous 
other procedures for manipulating list-structured data and vector- 
structured data can be found in the Scheme online documentation. 
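
A few of those procedures are sketched below. All of them (length, vector-length, append, map, vector->list, list->vector) are standard Scheme; the name nested is an illustrative assumption.

```scheme
(define a-list (list 6 946 8 356 12 620))
(define a-vector (vector 37 63 49 21 88 56))

(length a-list)                            ; 6
(vector-length a-vector)                   ; 6
(append (list 1 2) (list 3 4))             ; (1 2 3 4)
(map (lambda (x) (* 2 x)) (list 1 2 3))    ; (2 4 6)
(vector->list (vector 1 2 3))              ; (1 2 3)
(list->vector (list 1 2 3))                ; #(1 2 3)

;; Elements may themselves be compound data.
;; nested is a hypothetical name.
(define nested (list 1 (vector 2 3) (list 4 5)))
(vector-ref (list-ref nested 1) 0)         ; 2
(car (list-ref nested 2))                  ; 4
```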

Symbols 

Symbols are a very important kind of primitive data type that we 
use to make programs and algebraic expressions. You probably 
have noticed that Scheme programs look just like lists. In fact, 
they are lists. Some of the elements of the lists that make up 
programs are symbols, such as + and vector. 8 If we are to make 
programs that can manipulate programs, we need to be able to 
write an expression that names such a symbol. This is accom- 
plished by the mechanism of quotation. The name of the symbol 
+ is the expression '+, and in general the name of an expression
is the expression preceded by a single quote character. Thus the
name of the expression (+ 3 a) is '(+ 3 a).


8 Symbols may have any number of characters. A symbol may not contain 
whitespace or a delimiter character, such as parentheses, brackets, quotation 
marks, comma, or 
We can test if two symbols are identical by using the predicate
eq?. For example, we can write a program to determine if an
expression is a sum:

(define (sum? expression)
  (and (pair? expression)
       (eq? (car expression) '+)))

(sum? '(+ 3 a))

#t

(sum? '(* 3 a))

#f

Here #t and #f are the printed representations of the boolean 
values true and false. 

Consider what would happen if we were to leave out the quote in
the expression (sum? '(+ 3 a)). If the variable a had the value 4
we would be asking if 7 is a sum. But what we wanted to know 
was whether the expression (+ 3 a) is a sum. That is why we 
need the quote. 
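
Because quoted expressions are ordinary list structure, other small programs that inspect programs follow the same pattern as sum?. The sketch below is plain Scheme; product? and count-symbol are hypothetical names introduced only for this illustration.

```scheme
(define (sum? expression)
  (and (pair? expression)
       (eq? (car expression) '+)))

;; product? is a hypothetical companion to sum?.
(define (product? expression)
  (and (pair? expression)
       (eq? (car expression) '*)))

;; count-symbol is a hypothetical helper: it counts how many times
;; the symbol sym occurs anywhere in a nested expression.
(define (count-symbol sym expression)
  (cond ((pair? expression)
         (+ (count-symbol sym (car expression))
            (count-symbol sym (cdr expression))))
        ((eq? expression sym) 1)
        (else 0)))

(product? '(* 3 a))                   ; #t
(sum? 7)                              ; #f
(count-symbol 'a '(+ 3 a (* a b)))    ; 2
```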





B 

Our Notation 


An adequate notation should be understood by at 
least two people, one of whom may be the author. 

Abdus Salam (1950). 


We adopt a functional mathematical notation that is close to that 
used by Spivak in his Calculus on Manifolds [17]. The use of 
functional notation avoids many of the ambiguities of traditional 
mathematical notation that can impede clear reasoning. Func- 
tional notation carefully distinguishes the function from the value 
of the function when applied to particular arguments. In func- 
tional notation mathematical expressions are unambiguous and 
self-contained. 

We adopt a generic arithmetic in which the basic arithmetic 
operations, such as addition and multiplication, are extended to 
a wide variety of mathematical types. Thus, for example, the ad- 
dition operator + can be applied to numbers, tuples of numbers, 
matrices, functions, etc. Generic arithmetic formalizes the com- 
mon informal practice used to manipulate mathematical objects. 

We often want to manipulate aggregate quantities, such as the 
collection of all of the rectangular coordinates of a collection of 
particles, without explicitly manipulating the component parts. 
Tensor arithmetic provides a traditional way of manipulating ag- 
gregate objects: Indices label the parts; conventions, such as the 
summation convention, are introduced to manipulate the indices. 
We introduce a tuple arithmetic as an alternative way of manipu- 
lating aggregate quantities that usually lets us avoid labeling the 
parts with indices. Tuple arithmetic is inspired by tensor arith- 
metic but it is more general: not all of the components of a tuple 
need to be of the same size or type. 

The mathematical notation is in one-to-one correspondence 
with expressions of the computer language Scheme [10]. Scheme 
is based on the λ-calculus [5] and directly supports the manipulation
of functions. We augment Scheme with symbolic, numerical,
and generic features to support our applications. For a simple 
introduction to Scheme, see Appendix A. The correspondence be-
tween the mathematical notation and Scheme requires that math- 
ematical expressions be unambiguous and self-contained. Scheme 
provides immediate feedback in verification of mathematical de- 
ductions and facilitates the exploration of the behavior of systems. 

Functions 

The expression f(x) denotes the value of the function f at the
given argument x; when we wish to denote the function we write
just f. Functions may take several arguments. For example, we
may have the function that gives the Euclidean distance between
two points in the plane given by their rectangular coordinates:

$$d(x_1, y_1, x_2, y_2) = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}.$$ (B.1)

In Scheme we can write this as: 

(define (d x1 y1 x2 y2)
  (sqrt (+ (square (- x2 x1)) (square (- y2 y1)))))

Functions may be composed if the range of one overlaps the
domain of the other. The composition of functions is constructed
by passing the output of one to the input of the other. We write
the composition of two functions using the $\circ$ operator:

$$(f \circ g) : x \mapsto (f \circ g)(x) = f(g(x)).$$ (B.2)

A procedure h that computes the cube of the sine of its argument
may be defined by composing the procedures cube and sin:

(define h (compose cube sin))

(h 2)

.7518269446689928

which is the same as

(cube (sin 2))

.7518269446689928

Arithmetic is extended to the manipulation of functions: the
usual mathematical operations may be applied to functions. Examples
are addition and multiplication; we may add or multiply
two functions if they take the same kinds of arguments and if their
values can be added or multiplied:

$$(f + g)(x) = f(x) + g(x),$$
$$(fg)(x) = f(x) g(x).$$ (B.3)

A procedure g that multiplies the cube of its argument by the sine 
of its argument is 

(define g (* cube sin)) 

(g 2) 

7.274379414605454 

(* (cube 2) (sin 2)) 

7.274379414605454 

Symbolic Values 

As in usual mathematical notation, arithmetic is extended to al- 
low the use of symbols that represent unknown or incompletely 
specified mathematical objects. These symbols are manipulated 
as if they had values of a known type. By default, a Scheme 
symbol is assumed to represent a real number. So the expression 
'a is a literal Scheme symbol that represents an unspecified real
number:

((compose cube sin) 'a)

(expt (sin a) 3)


The default printer simplifies the expression, 1 and displays it in a 
readable form. We can use the simplifier to verify a trigonometric 
identity: 

((- (+ (square sin) (square cos)) 1) 'a)

0 


Just as it is useful to be able to manipulate symbolic numbers, 
it is useful to be able to manipulate symbolic functions. The 
procedure literal-function makes a procedure that acts as a 
function having no properties other than its name. By default, a 
literal function is defined to take one real argument and produce 


1 The procedure print-expression can be used in a program to print a sim-
plified version of an expression. The default printer in the user interface 
incorporates the simplifier. 
one real value. For example, we may want to work with a function
$f : \mathbf{R} \to \mathbf{R}$:

((literal-function 'f) 'x)

(f x)

((compose (literal-function 'f) (literal-function 'g)) 'x)

(f (g x))

We can also make literal functions of multiple, possibly structured
arguments that return structured values. For example, to
denote a literal function named g that takes two real arguments
and returns a real value ($g : \mathbf{R} \times \mathbf{R} \to \mathbf{R}$) we may write:

(define g (literal-function 'g (-> (X Real Real) Real)))

(g 'x 'y)

(g x y)

We may use such a literal function anywhere that an explicit func- 
tion of the same type may be used. 

There is a whole language for describing the type of a literal 
function in terms of the number of arguments, the types of the 
arguments, and the types of the values. Here we describe a func- 
tion that maps pairs of real numbers to real numbers with the 
expression (-> (X Real Real) Real). Later we will introduce 
structured arguments and values and show extensions of literal 
functions to handle these. 

Tuples 

There are two kinds of tuples: up tuples and down tuples. We 
write tuples as ordered lists of their components; a tuple is de- 
limited by parentheses if it is an up tuple and by square brackets 
if it is a down tuple. For example, the up tuple v of velocity
components $v^0$, $v^1$, and $v^2$ is

$$v = (v^0, v^1, v^2).$$ (B.4)

The down tuple p of momentum components $p_0$, $p_1$, and $p_2$ is

$$p = [p_0, p_1, p_2].$$ (B.5)

A component of an up tuple is usually identified by a superscript. 
A component of a down tuple is usually identified by a subscript. 
We use zero-based indexing when referring to tuple elements. This 
notation follows the usual convention in tensor arithmetic. 
We make tuples with the constructors up and down:

(define v (up 'v^0 'v^1 'v^2))

v

(up v^0 v^1 v^2)

(define p (down 'p_0 'p_1 'p_2))

p

(down p_0 p_1 p_2)

Note that v^0 and p_2 are just symbols. The caret and underline
characters are symbol constituents, so there is no meaning other 
than mnemonic to the structure of these symbols. However, our 
software can also display expressions using TeX, and then these 
decorations turn into superscripts and subscripts. 

Tuple arithmetic is different from the usual tensor arithmetic 
in that the components of a tuple may also be tuples and different 
components need not have the same structure. For example, a 
tuple structure s of phase-space states is 

$$s = (t, (x, y), [p_x, p_y]).$$ (B.6)

It is an up tuple of the time, the coordinates, and the momenta.
The time t has no substructure. The coordinates are an up tuple
of the coordinate components x and y. The momentum is a down
tuple of the momentum components $p_x$ and $p_y$. In Scheme this is
written:

(define s (up 't (up 'x 'y) (down 'p_x 'p_y)))

In order to reference components of tuple structures there are 
selector functions, for example: 

$$\begin{aligned}
I(s) &= s \\
I_0(s) &= t \\
I_1(s) &= (x, y) \\
I_2(s) &= [p_x, p_y] \\
I_{1,0}(s) &= x \\
&\;\;\vdots \\
I_{2,1}(s) &= p_y.
\end{aligned}$$ (B.7)

The sequence of integer subscripts on the selector describes the 
access chain to the desired component. 
The procedure component is the general selector procedure that
implements the selector function $I_z$:

((component 0 1) (up (up 'a 'b) (up 'c 'd)))
b 

To access a component of a tuple we may also use the selector 
procedure ref, which takes a tuple and an index and returns the 
indicated element of the tuple: 

(ref (up 'a 'b 'c) 1)
b 

We use zero-based indexing everywhere. The procedure ref can 
be used to access any substructure of a tree of tuples: 

(ref (up (up 'a 'b) (up 'c 'd)) 0 1)
b 


Two up tuples of the same length may be added or subtracted, 
elementwise, to produce an up tuple, if the components are com- 
patible for addition. Similarly, two down tuples of the same length 
may be added or subtracted, elementwise, to produce a down tu- 
ple, if the components are compatible for addition. 

Any tuple may be multiplied by a number by multiplying each 
component by the number. Numbers may, of course, be mul- 
tiplied. Tuples that are compatible for addition form a vector 
space. 

For convenience we define the square of a tuple to be the sum 
of the squares of the components of the tuple. Tuples can be 
multiplied, as described below, but the square of a tuple is not 
the product of the tuple with itself. 

The meaning of multiplication of tuples depends on the struc- 
ture of the tuples. Two tuples are compatible for contraction if 
they are of opposite types, they are of the same length, and cor- 
responding elements have the following property: either they are 
both tuples and are compatible for contraction, or at least one 
is not a tuple. If two tuples are compatible for contraction then 
generic multiplication is interpreted as contraction: the result is 
the sum of the products of corresponding components of the tu- 
ples. For example, p and v introduced in equations (B.4) and (B.5) 
above are compatible for contraction; the product is 

$$pv = p_0 v^0 + p_1 v^1 + p_2 v^2.$$ (B.8)






So the product of tuples that are compatible for contraction is an 
inner product. Using the tuples p and v defined above gives us 

(* p v) 

(+ (* p_0 v^0) (* p_1 v^1) (* p_2 v^2))


Contraction of tuples is commutative: pv = vp. Caution: Multiplication
of tuples that are compatible for contraction is, in general,
not associative. For example, let u = (5, 2), v = (11, 13), and
g = [[3, 5], [7, 9]]. Then u(gv) = 964, but (ug)v = 878. The
expression ugv is ambiguous. An expression that has this ambiguity
does not occur in this book.
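
The two values quoted above can be reproduced with a small model of the multiplication rules. The following is a minimal sketch in plain Scheme, not the Scmutils implementation: tuples are modeled as lists tagged with the symbol up or down, and every helper name (tuple?, every-pair?, compatible?, t+, sum-all, t*) is an assumption of this sketch. It covers only numeric components, and it should not be loaded on top of Scmutils because it redefines up and down.

```scheme
;; A tuple is modeled as a list tagged with the symbol up or down.
(define (up . elements) (cons 'up elements))
(define (down . elements) (cons 'down elements))

(define (tuple? x)
  (and (pair? x)
       (or (eq? (car x) 'up) (eq? (car x) 'down))))

;; True if the lists have equal length and pred holds for every
;; pair of corresponding elements.
(define (every-pair? pred xs ys)
  (cond ((and (null? xs) (null? ys)) #t)
        ((or (null? xs) (null? ys)) #f)
        (else (and (pred (car xs) (car ys))
                   (every-pair? pred (cdr xs) (cdr ys))))))

;; Opposite types, same length, and corresponding elements are
;; either compatible tuples or include at least one non-tuple.
(define (compatible? a b)
  (and (tuple? a)
       (tuple? b)
       (not (eq? (car a) (car b)))
       (every-pair? (lambda (x y)
                      (or (not (tuple? x))
                          (not (tuple? y))
                          (compatible? x y)))
                    (cdr a)
                    (cdr b))))

;; Elementwise sum of two numbers or two tuples of the same shape.
(define (t+ a b)
  (if (tuple? a)
      (cons (car a) (map t+ (cdr a) (cdr b)))
      (+ a b)))

;; Sum of a nonempty list of terms.
(define (sum-all terms)
  (if (null? (cdr terms))
      (car terms)
      (t+ (car terms) (sum-all (cdr terms)))))

(define (t* a b)
  (cond ((compatible? a b)                 ; contraction
         (sum-all (map t* (cdr a) (cdr b))))
        ((tuple? b)                        ; result has the type of b
         (cons (car b) (map (lambda (y) (t* a y)) (cdr b))))
        ((tuple? a)                        ; tuple times a number
         (cons (car a) (map (lambda (x) (t* x b)) (cdr a))))
        (else (* a b))))

(define u (up 5 2))
(define v (up 11 13))
(define g (down (down 3 5) (down 7 9)))

(t* g v)             ; (down 124 172)
(t* u (t* g v))      ; 964
(t* u g)             ; (down 29 43)
(t* (t* u g) v)      ; 878
```

The intermediate results show where the two groupings part ways: gv contracts g with v first, while ug contracts u with g first, and the remaining contraction then pairs different numbers.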

The rule for multiplying two structures that are not compati- 
ble for contraction is simple. If A and B are not compatible for 
contraction, the product AB is a tuple of type B whose compo- 
nents are the products of A and the components of B. The same 
rule is applied recursively in multiplying the components. So if
$B = (B^0, B^1, B^2)$, the product of A and B is

$$AB = (AB^0, AB^1, AB^2).$$ (B.9)

If A and C are not compatible for contraction and $C = [C_0, C_1, C_2]$,
the product is

$$AC = [AC_0, AC_1, AC_2].$$ (B.10)


Tuple structures can be made to represent linear transforma- 
tions. For example, the rotation commonly represented by the 
matrix

$$\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$$ (B.11)


can be represented as a tuple structure: 2

$$\left[ \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix} \begin{pmatrix} -\sin\theta \\ \cos\theta \end{pmatrix} \right].$$ (B.12)


2 To emphasize the relationship of simple tuple structures to matrix notation 
we often format up tuples as vertical arrangements of components and down 
tuples as horizontal arrangements of components. However, we could just as 
well have written this tuple as $[(\cos\theta, \sin\theta), (-\sin\theta, \cos\theta)]$.
Such a tuple is compatible for contraction with an up tuple that
represents a vector. So, for example:

$$\left[ \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix} \begin{pmatrix} -\sin\theta \\ \cos\theta \end{pmatrix} \right] \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x\cos\theta - y\sin\theta \\ x\sin\theta + y\cos\theta \end{pmatrix}.$$ (B.13)


The product of two tuples that represent linear transformations — 
which are not compatible for contraction — represents the compo- 
sition of the linear transformations. For example, the product of 
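the tuples representing two rotations is

$$\left[ \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix} \begin{pmatrix} -\sin\theta \\ \cos\theta \end{pmatrix} \right] \left[ \begin{pmatrix} \cos\varphi \\ \sin\varphi \end{pmatrix} \begin{pmatrix} -\sin\varphi \\ \cos\varphi \end{pmatrix} \right] = \left[ \begin{pmatrix} \cos(\theta + \varphi) \\ \sin(\theta + \varphi) \end{pmatrix} \begin{pmatrix} -\sin(\theta + \varphi) \\ \cos(\theta + \varphi) \end{pmatrix} \right].$$ (B.14)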


Multiplication of tuples that represent linear transformations is as- 
sociative but generally not commutative, just as the composition 
of the transformations is associative but not generally commuta- 
tive. 
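
The product of two rotations in (B.14) can be checked by hand with the multiplication rules above. Writing $R(\theta)$ for the rotation tuple with angle $\theta$ (a label introduced here only for brevity), the two factors are both down tuples and so are not compatible for contraction. The product therefore keeps the down-tuple shape of the right factor, and each of its components is a contraction of $R(\theta)$ with an up tuple, as in (B.13):

```latex
\begin{aligned}
R(\theta)\,R(\varphi)
 &= \left[\, R(\theta)\begin{pmatrix}\cos\varphi\\ \sin\varphi\end{pmatrix},\;
             R(\theta)\begin{pmatrix}-\sin\varphi\\ \cos\varphi\end{pmatrix} \right] \\
 &= \left[\, \begin{pmatrix}\cos\theta\cos\varphi-\sin\theta\sin\varphi\\
                            \sin\theta\cos\varphi+\cos\theta\sin\varphi\end{pmatrix},\;
             \begin{pmatrix}-\cos\theta\sin\varphi-\sin\theta\cos\varphi\\
                            -\sin\theta\sin\varphi+\cos\theta\cos\varphi\end{pmatrix} \right] \\
 &= \left[\, \begin{pmatrix}\cos(\theta+\varphi)\\ \sin(\theta+\varphi)\end{pmatrix},\;
             \begin{pmatrix}-\sin(\theta+\varphi)\\ \cos(\theta+\varphi)\end{pmatrix} \right].
\end{aligned}
```

The last step uses the angle-addition formulas for sine and cosine.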


Derivatives 

The derivative of a function f is a function, denoted by Df. Our
notational convention is that D is a high-precedence operator.
Thus D operates on the adjacent function before any other application
occurs: Df(x) is the same as (Df)(x). Higher-order
derivatives are described by exponentiating the derivative operator.
Thus the nth derivative of a function f is notated as $D^n f$.

The Scheme procedure for producing the derivative of a function 
is named D. The derivative of the sin procedure is a procedure that 
computes cos: 

(define derivative-of-sine (D sin)) 

(derivative-of-sine 'x)

(cos x) 


The derivative of a function f is the function Df whose value
for a particular argument is something that can be multiplied by
an increment $\Delta x$ in the argument to get a linear approximation
to the increment in the value of f:

$$f(x + \Delta x) \approx f(x) + Df(x) \Delta x.$$ (B.15)
For example, let f be the function that cubes its argument
($f(x) = x^3$); then Df is the function that yields three times
the square of its argument ($Df(y) = 3y^2$). So f(5) = 125 and
Df(5) = 75. The value of f with argument $x + \Delta x$ is

$$f(x + \Delta x) = (x + \Delta x)^3 = x^3 + 3x^2 \Delta x + 3x \Delta x^2 + \Delta x^3$$ (B.16)

and

$$Df(x) \Delta x = 3x^2 \Delta x.$$ (B.17)

So Df(x) multiplied by $\Delta x$ gives us the term in $f(x + \Delta x)$ that is
linear in $\Delta x$, providing a good approximation to $f(x + \Delta x) - f(x)$
when $\Delta x$ is small.
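
The quality of the linear approximation can be checked numerically. The sketch below is plain Scheme using exact rational arithmetic; the derivative is written out by hand rather than computed with D, and the names D-cube and approximation-error are assumptions made for this example.

```scheme
(define (cube x) (* x x x))

;; The derivative of cube, written out by hand from the text.
(define (D-cube x) (* 3 x x))

;; Difference between the true increment of cube and its linear
;; approximation; algebraically this is 3 x dx^2 + dx^3.
(define (approximation-error x dx)
  (- (- (cube (+ x dx)) (cube x))
     (* (D-cube x) dx)))

(approximation-error 5 1/10)     ; 151/1000
(approximation-error 5 1/100)    ; 1501/1000000
```

Shrinking the increment by a factor of 10 shrinks the error by roughly a factor of 100, as expected from the leading error term, which is quadratic in the increment.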

Derivatives of compositions obey the chain rule: 


$$D(f \circ g) = ((Df) \circ g) \cdot Dg.$$ (B.18)

So at x,

$$(D(f \circ g))(x) = Df(g(x)) \cdot Dg(x).$$ (B.19)
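
As a worked instance of the chain rule, take the composition of cube with sin used earlier in this appendix. With $f(y) = y^3$, so that $Df(y) = 3y^2$, and $g = \sin$, so that $Dg = \cos$:

```latex
\begin{aligned}
(D(f \circ g))(x) &= Df(g(x)) \cdot Dg(x) \\
                  &= 3\,(\sin x)^2 \cdot \cos x .
\end{aligned}
```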


D is an example of an operator. An operator is like a function 
except that multiplication of operators is interpreted as composi- 
tion, whereas multiplication of functions is multiplication of the 
values (see equation B.3). If D were an ordinary function, then 
the rule for multiplication would imply that $D^2 f$ would just be
the product of Df with itself, which is not what is intended. A 
product of a number and an operator scales the operator. So, for 
example 

(((* 5 D) cos) 'x)

(* -5 (sin x))

Arithmetic is extended to allow manipulation of operators. A
typical operator is $(D + I)(D - I) = D^2 - I$, where I is the identity
operator; it subtracts a function from its second derivative. Such an
operator can be constructed and used in Scheme as follows:

(((* (+ D I) (- D I)) (literal-function 'f)) 'x)

(+ (((expt D 2) f) x) (* -1 (f x)))
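
The operator identity can be verified by applying both factors to a function f in turn, remembering that a product of operators means composition and that D is linear:

```latex
\begin{aligned}
((D + I)(D - I))\,f &= (D + I)\,(Df - f) \\
                    &= D(Df - f) + (Df - f) \\
                    &= D^2 f - Df + Df - f \\
                    &= D^2 f - f .
\end{aligned}
```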
Derivatives of Functions of Multiple Arguments

The derivative generalizes to functions that take multiple arguments.
The derivative of a real-valued function of multiple arguments
is an object whose contraction with the tuple of increments
in the arguments gives a linear approximation to the increment in
the function’s value.

A function of multiple arguments can be thought of as a func- 
tion of an up tuple of those arguments. Thus an incremental ar- 
gument tuple is an up tuple of components, one for each argument 
position. The derivative of such a function is a down tuple of the 
partial derivatives of the function with respect to each argument 
position. 

Suppose we have a real-valued function g of two real-valued 
arguments, and we want to approximate the increment in the value 
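of g from its value at x, y. If the arguments are incremented by
the tuple $(\Delta x, \Delta y)$ we compute:

$$\begin{aligned}
Dg(x, y) \cdot (\Delta x, \Delta y) &= [\partial_0 g(x, y), \partial_1 g(x, y)] \cdot (\Delta x, \Delta y) \\
&= \partial_0 g(x, y) \Delta x + \partial_1 g(x, y) \Delta y.
\end{aligned}$$ (B.20)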

Using the two-argument literal function g defined above, we
have:

((D g) 'x 'y)

(down (((partial 0) g) x y) (((partial 1) g) x y))


In general, partial derivatives are just the components of the 
derivative of a function that takes multiple arguments (or struc- 
tured arguments or both; see below). So a partial derivative of a 
function is a composition of a component selector and the deriva- 
tive of that function. 3 Indeed: 

$$\partial_0 g = I_0 \circ Dg,$$ (B.21)
$$\partial_1 g = I_1 \circ Dg.$$ (B.22)

Concretely, if

$$g(x, y) = x^3 y^5$$ (B.23)


3 Partial derivative operators such as (partial 2) are operators, so (expt 
(partial 1) 2) is a second partial derivative. 
then

$$Dg(x, y) = [3x^2 y^5, 5x^3 y^4]$$ (B.24)

and the first-order approximation of the increment for changing
the arguments by $\Delta x$ and $\Delta y$ is

$$\begin{aligned}
g(x + \Delta x, y + \Delta y) - g(x, y) &\approx [3x^2 y^5, 5x^3 y^4] \cdot (\Delta x, \Delta y) \\
&= 3x^2 y^5 \Delta x + 5x^3 y^4 \Delta y.
\end{aligned}$$ (B.25)
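
The first-order approximation can be compared with the actual increment numerically. This is a minimal sketch in plain Scheme with exact rationals; the partial derivatives are written out by hand from (B.24) rather than computed with D, and the names d0-g, d1-g, actual-increment, and linear-increment are assumptions made for this example.

```scheme
(define (g x y) (* (expt x 3) (expt y 5)))

;; Components of Dg from (B.24), written out by hand.
(define (d0-g x y) (* 3 (expt x 2) (expt y 5)))
(define (d1-g x y) (* 5 (expt x 3) (expt y 4)))

;; True change in g when the arguments move by dx and dy.
(define (actual-increment x y dx dy)
  (- (g (+ x dx) (+ y dy)) (g x y)))

;; Contraction of the down tuple of partials with the up tuple
;; of increments, as in (B.25).
(define (linear-increment x y dx dy)
  (+ (* (d0-g x y) dx)
     (* (d1-g x y) dy)))

(linear-increment 2 1 1/100 1/100)    ; 13/25

;; The discrepancy is second order in the increments.
(- (actual-increment 2 1 1/100 1/100)
   (linear-increment 2 1 1/100 1/100))
```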

Partial derivatives of compositions also obey a chain rule:

$$\partial_i(f \circ g) = ((Df) \circ g) \cdot \partial_i g.$$ (B.26)

So if x is a tuple of arguments, then

$$(\partial_i(f \circ g))(x) = Df(g(x)) \cdot \partial_i g(x).$$ (B.27)

Mathematical notation usually does not distinguish functions 
of multiple arguments and functions of the tuple of arguments. 
Let h((x,y)) = g(x,y). The function h, which takes a tuple of 
arguments x and y, is not distinguished from the function g that 
takes arguments x and y. We use both ways of defining functions 
of multiple arguments. The derivatives of both kinds of functions 
are compatible for contraction with a tuple of increments to the 
arguments. Scheme comes in handy here: 

(define (h s)
  (g (ref s 0) (ref s 1)))

(h (up 'x 'y))

(g x y)

((D g) 'x 'y)

(down (((partial 0) g) x y) (((partial 1) g) x y))

((D h) (up 'x 'y))

(down (((partial 0) g) x y) (((partial 1) g) x y))


A phase-space state function is a function of time, coordinates, 
and momenta. Let H be such a function. The value of H is 
$H(t, (x, y), [p_x, p_y])$ for time t, coordinates $(x, y)$, and momenta
$[p_x, p_y]$. Let s be the phase-space state tuple as in (B.6):

$$s = (t, (x, y), [p_x, p_y]).$$ (B.28)
The value of H for argument tuple s is H(s). We use both ways
of writing the value of H. 

We often show a function of multiple arguments that include 
tuples by indicating the boundaries of the argument tuples with 
semicolons and separating their components with commas. If H 
is a function of phase-space states with arguments t, $(x, y)$, and
$[p_x, p_y]$, we may write $H(t; x, y; p_x, p_y)$. This notation loses the
up/down distinction, but our semicolon-and-comma notation is 
convenient and reasonably unambiguous. 

The derivative of H is a function that produces an object that
can be contracted with an increment in the argument structure to 
produce an increment in the function’s value. The derivative is a 
down tuple of three partial derivatives. The first partial derivative 
is the partial derivative with respect to the numerical argument. 
The second partial derivative is a down tuple of partial derivatives 
with respect to each component of the up-tuple argument. The 
third partial derivative is an up tuple of partial derivatives with 
respect to each component of the down-tuple argument: 

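$$\begin{aligned}
DH(s) &= [\partial_0 H(s), \partial_1 H(s), \partial_2 H(s)] \\
&= [\partial_0 H(s), [\partial_{1,0} H(s), \partial_{1,1} H(s)], (\partial_{2,0} H(s), \partial_{2,1} H(s))],
\end{aligned}$$ (B.29)

where $\partial_{1,0}$ indicates the partial derivative with respect to the first
component (index 0) of the second argument (index 1) of the function,
and so on. Indeed, $\partial_z F = I_z \circ DF$ for any function F and
access chain z. So, if we let $\Delta s$ be an incremental phase-space
state tuple,

$$\Delta s = (\Delta t, (\Delta x, \Delta y), [\Delta p_x, \Delta p_y]),$$ (B.30)

then

$$\begin{aligned}
DH(s) \Delta s = {} & \partial_0 H(s) \Delta t \\
& + \partial_{1,0} H(s) \Delta x + \partial_{1,1} H(s) \Delta y \\
& + \partial_{2,0} H(s) \Delta p_x + \partial_{2,1} H(s) \Delta p_y.
\end{aligned}$$ (B.31)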

Caution: Partial derivative operators with respect to different 
structured arguments generally do not commute. 

In Scheme we must make explicit choices. We usually assume 
that phase-space state functions are functions of the tuple. For 
example, 
(define H
  (literal-function 'H
    (-> (UP Real (UP Real Real) (DOWN Real Real)) Real)))

(H s)

(H (up t (up x y) (down p_x p_y)))

((D H) s)

(down
 (((partial 0) H) (up t (up x y) (down p_x p_y)))
 (down (((partial 1 0) H) (up t (up x y) (down p_x p_y)))
       (((partial 1 1) H) (up t (up x y) (down p_x p_y))))
 (up (((partial 2 0) H) (up t (up x y) (down p_x p_y)))
     (((partial 2 1) H) (up t (up x y) (down p_x p_y)))))


Structured Results 

Some functions produce structured outputs. A function whose 
output is a tuple is equivalent to a tuple of component functions 
each of which produces one component of the output tuple. 

For example, a function that takes one numerical argument and 
produces a structure of outputs may be used to describe a curve 
through space. The following function describes a helical path
around the z-axis in 3-dimensional space:

$$h(t) = (\cos t, \sin t, t) = (\cos, \sin, I)(t).$$ (B.32)

The derivative is just the up tuple of the derivatives of each component
of the function:

$$Dh(t) = (-\sin t, \cos t, 1).$$ (B.33)

In Scheme we can write 

(define (helix t)
  (up (cos t) (sin t) t))

or just 

(define helix (up cos sin identity)) 

Its derivative is just the up tuple of the derivatives of each com- 
ponent of the function: 


((D helix) 't)

(up (* -1 (sin t)) (cos t) 1)
In general, a function that produces structured outputs is just
treated as a structure of functions, one for each of the components. 
The derivative of a function of structured inputs that produces 
structured outputs is an object that when contracted with an in- 
cremental input structure produces a linear approximation to the 
incremental output. Thus, if we define function g by 


$$g(x, y) = ((x + y)^2, (y - x)^3, e^{x+y}),$$ (B.34)

then the derivative of g is

$$Dg(x, y) = \left[ \begin{pmatrix} 2(x + y) \\ -3(y - x)^2 \\ e^{x+y} \end{pmatrix} \begin{pmatrix} 2(x + y) \\ 3(y - x)^2 \\ e^{x+y} \end{pmatrix} \right].$$ (B.35)


In Scheme: 

(define (g x y)
  (up (square (+ x y)) (cube (- y x)) (exp (+ x y))))

((D g) 'x 'y)

(down (up (+ (* 2 x) (* 2 y))
          (+ (* -3 (expt x 2)) (* 6 x y) (* -3 (expt y 2)))
          (* (exp y) (exp x)))
      (up (+ (* 2 x) (* 2 y))
          (+ (* 3 (expt x 2)) (* -6 x y) (* 3 (expt y 2)))
          (* (exp y) (exp x))))

Exercise B.1: Chain Rule

Let $F(x, y) = x^2 y^3$, $G(x, y) = (F(x, y), y)$, and $H(x, y) = F(F(x, y), y)$,
so that $H = F \circ G$.

a. Compute $\partial_0 F(x, y)$ and $\partial_1 F(x, y)$.

b. Compute $\partial_0 F(F(x, y), y)$ and $\partial_1 F(F(x, y), y)$.

c. Compute $\partial_0 G(x, y)$ and $\partial_1 G(x, y)$.

d. Compute $DF(a, b)$, $DG(3, 5)$ and $DH(3a^2, 5b^3)$.

Exercise B.2: Computing Derivatives 

We can represent functions of multiple arguments as procedures in sev- 
eral ways, depending upon how we wish to use them. The simplest idea 
is to identify the procedure arguments with the function’s arguments. 

For example, we could write implementations of the functions that 
occur in exercise B.1 as follows:

(define (f x y)
  (* (square x) (cube y)))

(define (g x y)
  (up (f x y) y))

(define (h x y)
  (f (f x y) y))

With this choice it is awkward to compose a function that takes multiple
arguments, such as f, with a function that produces a tuple of
those arguments, such as g. Alternatively, we can represent the function
arguments as slots of a tuple data structure, and then composition with 
a function that produces such a data structure is easy. However, this 
choice requires the procedures to build and take apart structures. 

For example, we may define procedures that implement the functions 
above as follows: 

(define (f v)
  (let ((x (ref v 0))
        (y (ref v 1)))
    (* (square x) (cube y))))

(define (g v)
  (let ((x (ref v 0))
        (y (ref v 1)))
    (up (f v) y)))

(define h (compose f g))

Repeat exercise B.l using the computer. Explore both implementa- 
tions of multiple-argument functions. 





C

Tensors 


There are a variety of objects that have meaning independent 
of any particular basis. Examples are form fields, vector fields, 
covariant derivative, and so on. We call objects that are inde- 
pendent of basis geometric objects. Some of these are functions 
that take other geometric objects, such as vector fields and form 
fields, as arguments and produce further geometric objects. We 
refer to such functions as geometric functions. We want the laws 
of physics to be independent of the coordinate systems. How we 
describe an experiment should not affect the result. If we use only 
geometric objects in our descriptions then this is automatic. 

A geometric function of vector fields and form fields that is 
linear in each argument with functions as multipliers is called a 
tensor. For example, let T be a geometric function of a vector field 
and form field that gives a real-number result at the manifold point 
m. Then 

$$T(fu + gv, \omega) = f\,T(u, \omega) + g\,T(v, \omega) \tag{C.1}$$

$$T(u, f\omega + g\theta) = f\,T(u, \omega) + g\,T(u, \theta), \tag{C.2}$$

where $u$ and $v$ are vector fields, $\omega$ and $\theta$ are form fields, and $f$ and 
$g$ are manifold functions. That a tensor is linear over functions 
and not just constants is important. 

The multilinearity over functions implies that the components 
of the tensor transform in a particularly simple way as the basis 
is changed. The components of a real-valued geometric function 
of vector fields and form fields are obtained by evaluating the 
function on a set of basis vectors and their dual form basis. In our 
example, 

$$T^i_j = T(e_j, \tilde{e}^i), \tag{C.3}$$

for basis vector fields $e_j$ and dual form fields $\tilde{e}^i$. On the left, $T^i_j$ is 
a function of place (manifold point); on the right, $T$ is a function 
of a vector field and a form field that returns a function of place. 
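
To see why linearity over functions matters for the components, it may help to expand general arguments in the basis. The following is a sketch, assuming the expansions $u = \sum_j \tilde{e}^j(u)\, e_j$ and $\omega = \sum_i \omega(e_i)\, \tilde{e}^i$, whose coefficients are manifold functions rather than constants:

```latex
\begin{aligned}
T(u, \omega)
  &= T\Big(\sum_j \tilde{e}^j(u)\, e_j,\ \sum_i \omega(e_i)\, \tilde{e}^i\Big) \\
  &= \sum_{ij} \tilde{e}^j(u)\, \omega(e_i)\, T(e_j, \tilde{e}^i) \\
  &= \sum_{ij} T^i_j\, \tilde{e}^j(u)\, \omega(e_i).
\end{aligned}
```

The second step pulls function coefficients out of both slots, which linearity over constants alone would not permit. It is this step that lets the component functions determine the tensor.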

Now we consider a change of basis, $e(f) = e'(f)\,J$ or 

$$e_i(f) = \sum_j e'_j(f)\, J^j_i, \tag{C.4}$$

where $J$ typically depends on place. The corresponding dual basis 
transforms as 

$$\tilde{e}^i(v) = \sum_j K^i_j\, \tilde{e}'^j(v), \tag{C.5}$$

where $K = J^{-1}$ or $\sum_j K^i_j(m)\, J^j_k(m) = \delta^i_k$. 

Because the tensor is multilinear over functions, we can deduce 
that the tensor components in the two bases are related by, in our 
example, 

$$T^i_j = \sum_{kl} K^i_k\, T'^k_l\, J^l_j \tag{C.6}$$

or 

$$T'^i_j = \sum_{kl} J^i_k\, T^k_l\, K^l_j. \tag{C.7}$$
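
The deduction can be sketched by substituting the change of basis into the definition of the components. This is a reconstruction under the conventions above, writing $T'^k_l = T(e'_l, \tilde{e}'^k)$ for the components in the primed basis:

```latex
\begin{aligned}
T^i_j = T(e_j, \tilde{e}^i)
  &= T\Big(\sum_l e'_l\, J^l_j,\ \sum_k K^i_k\, \tilde{e}'^k\Big) \\
  &= \sum_{kl} K^i_k\, T(e'_l, \tilde{e}'^k)\, J^l_j \\
  &= \sum_{kl} K^i_k\, T'^k_l\, J^l_j .
\end{aligned}
```

Because $J$ and $K$ depend on place, the middle step needs linearity over functions. Multiplying on the left by $J$ and on the right by $K$, and using $\sum_j K^i_j J^j_k = \delta^i_k$, inverts the relation.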

Tensors are a restricted set of mathematical objects that are 
geometric, so if we restrict our descriptions to tensor expressions 
they are prima facie independent of the coordinates used to 
represent them. So if we can represent the physical laws in terms of 
tensors we have built in the coordinate-system independence. 

Let’s test whether the geometric function R, which we have 
called the Riemann tensor (see equation 8.2), is indeed a tensor 
field. A real-valued geometric function is a tensor if it is linear 
(over the functions) in each of its arguments. We can try it for 
3-dimensional rectangular coordinates: 

(let ((cs R3-rect))
  (let ((u (literal-vector-field 'u-coord cs))
        (v (literal-vector-field 'v-coord cs))
        (w (literal-vector-field 'w-coord cs))
        (x (literal-vector-field 'x-coord cs))
        (omega (literal-1form-field 'omega-coord cs))
        (nu (literal-1form-field 'nu-coord cs))
        (f (literal-manifold-function 'f-coord cs))
        (g (literal-manifold-function 'g-coord cs))
        (nabla (covariant-derivative (literal-Cartan 'G cs)))
        (m (typical-point cs)))
    (let ((F (Riemann nabla)))
      ((up (- (F (+ (* f omega) (* g nu)) u v w)
              (+ (* f (F omega u v w)) (* g (F nu u v w))))
           (- (F omega (+ (* f u) (* g x)) v w)
              (+ (* f (F omega u v w)) (* g (F omega x v w))))
           (- (F omega v (+ (* f u) (* g x)) w)
              (+ (* f (F omega v u w)) (* g (F omega v x w))))
           (- (F omega v w (+ (* f u) (* g x)))
              (+ (* f (F omega v w u)) (* g (F omega v w x)))))
       m))))
(up 0 0 0 0)

Now that we are convinced that the Riemann tensor is indeed a 
tensor, we know how its components change under a change of 
basis. Let 

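$$R^i_{jkl} = R(\tilde{e}^i, e_j, e_k, e_l), \tag{C.8}$$

then 

$$R^i_{jkl} = \sum_{mnpq} K^i_m\, R'^m_{npq}\, J^n_j\, J^p_k\, J^q_l \tag{C.9}$$

or 

$$R'^i_{jkl} = \sum_{mnpq} J^i_m\, R^m_{npq}\, K^n_j\, K^p_k\, K^q_l. \tag{C.10}$$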


Whew! 

It is easy to generalize these formulas to tensors with general 
arguments. We have formulated the general tensor test as a pro- 
gram tensor-test that takes the procedure T to be tested, a list of 
argument types, and a coordinate system to be used. It tests each 
argument for linearity (over functions). If the function passed as 
T is a tensor, the result will be a list of zeros. 

So, for example, Riemann proves to be a tensor 

(tensor-test
 (Riemann (covariant-derivative (literal-Cartan 'G R3-rect)))
 '(1form vector vector vector)
 R3-rect)
(0 0 0 0)

and so does the torsion (see equation 8.21): 

(tensor-test
 (torsion (covariant-derivative (literal-Cartan 'G R3-rect)))
 '(1form vector vector)
 R3-rect)

(up 0 0 0) 

But not all geometric functions are tensors. The covariant deriva- 
tive is an interesting and important case. The function F, defined 
by 

$$F(\omega, u, v) = \omega(\nabla_u v), \tag{C.11}$$

is a geometric object, since the result is independent of the coordinate 
system used to represent the $\nabla$. For example: 

(define ((F nabla) omega u v)
  (omega ((nabla u) v)))

(((- (F (covariant-derivative
         (Christoffel->Cartan
          (metric->Christoffel-2
           (coordinate-system->metric S2-spherical)
           (coordinate-system->basis S2-spherical)))))
     (F (covariant-derivative
         (Christoffel->Cartan
          (metric->Christoffel-2
           (coordinate-system->metric S2-stereographic)
           (coordinate-system->basis S2-stereographic))))))
  (literal-1form-field 'omega S2-spherical)
  (literal-vector-field 'u S2-spherical)
  (literal-vector-field 'v S2-spherical))
 ((point S2-spherical) (up 'theta 'phi)))
0


But it is not a tensor field: 

(tensor-test
 (F (covariant-derivative (literal-Cartan 'G R3-rect)))
 '(1form vector vector)
 R3-rect)
(0 0 MESS)

This result tells us that the function F is linear in its first two 
arguments but not in its third argument. 

That the covariant derivative is not linear over functions in the 
second vector argument is easy to understand. The first vector 
argument takes derivatives of the coefficients of the second vector 
argument, so multiplying these coefficients by a manifold function 
changes the derivative. 
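
The failure can be made explicit with the product rule for the covariant derivative of a vector field. As a sketch, with $f$ a manifold function and $u(f)$ the directional derivative of $f$ along $u$:

```latex
\begin{aligned}
F(\omega, u, f v) &= \omega(\nabla_u (f v)) \\
  &= \omega\big(f\, \nabla_u v + u(f)\, v\big) \\
  &= f\, F(\omega, u, v) + u(f)\, \omega(v).
\end{aligned}
```

The extra term $u(f)\,\omega(v)$ vanishes when $f$ is constant, so $F$ is linear over constants in its third argument but not over functions. In the first two arguments no derivative of the multiplier appears: $\nabla_{fu} v = f\, \nabla_u v$ and $(f\omega)(w) = f\, \omega(w)$.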

Page numbers for Scheme procedure definitions are in italics. 
Page numbers followed by n indicate footnotes. 


$\circ$ (composition), 196 
$\flat$, 133 

$\sharp$, 134 

D (derivative $R^n \to R^m$), 202 
$\partial$ (partial derivative), 206 
df (differential of manifold 
function), 33, 35 
d$\omega$ (exterior derivative of form 
field), 63 
$\delta^i_j$, 34 n 
$\nabla$ (nabla), 8 
$\gamma^v_m(t)$, 29 
$I_j$ (selector), 199 
$\mathcal{L}_v$ (Lie derivative), 85 
$\lambda$-calculus, 195 
$\lambda$-expression, 186-187 
$\lambda$-notation, 186 n 
$\phi^v_t(m)$, 29 
$\varpi^i_j$, 96 

' (quote in Scheme), 192 
, in tuple, 206 
; in tuple, 206 
# in Scheme, 192 
[ ] for down tuples, 198 
( ) for up tuples, 198 
( ) in Scheme, 185, 186 n, 191 
4tuple->ct, 176 
4tuple->space, 176 

Abuse of notation, 17 n 
Access chain, 199 
Alternative in conditional, 189 


Ampere, 167 

Angular momentum, SO(3) and, 
53 (ex. 4.3) 

Arguments, in Scheme, 185 
Arithmetic 
generic, 195 
on functions, 196 
on operators, 203 
on symbolic values, 197 
on tuples, 195, 199-202 
Associativity and 

non-associativity of tuple 
multiplication, 201, 202 
Atlas, on manifold, 11 

Basis fields, 41-53 
change of basis, 44-47 
dual forms, 41 
general vector as linear 
combination, 41 
Jacobian, 45 
linear independence, 43 
over a map, 74 
rotation basis, 47-48 
basis->1form-basis, 45 
basis->vector-basis, 45 
Bianchi identity, 129-131 
first, 130 
second, 130 
Boost, 172 
general, 174 
perpendicular, 174 
transformation under rotation, 
177 

Brackets for down tuples, 198 
car, 191 

Cardioid, 18 (ex. 2.1) 

Cartan one-forms, 96 
Christoffel coefficients and, 98 
expression in terms of covariant 
derivative, 98 

linearity over functions, 97 
parallel transport and, 96 
transformation rule, 101 
Cartan’s formula for Lie 
derivative, 92 
Cartan-transform, 101 
cdr, 191 
Chain rule 

for derivatives, 203, 208 (ex. 
B.1) 

for partial derivatives, 205, 208 
(ex. B.1) 

for vector fields, 26 
chart, 7, 14 
Chart, on manifold, 11 
Christoffel coefficients, 3, 8, 98 
first kind, 135 
from metric, 135 
Lagrange equations and, 3, 138 
metric and, 9 
second kind, 136 
Christoffel->Cartan, 9, 102 n 
Church, Alonzo, 186 n 
circular, 31 

Circular orbit in Schwarzschild 
spacetime, 148 (ex. 9.7) 
stability of, 148 (ex. 9.8) 

Closed form field, 64 
Coefficient functions 
one-form field, 34 
vector field, 23 
Combinatory logic, 77 n 
Comma in tuple, 206 
Commutativity of some tuple 
multiplication, 201 
Commutator, 48 
for rotation basis, 50-51 


meaning, 52 

non-coordinate basis, 53 (ex. 
4.2) 

zero for coordinate basis fields, 
48 

component, 200 

Components of the velocity, 76 

components->1form-field, 35 
components->vector-field, 24 
compose, 188 
Composition 

of functions, 196, 209 (ex. B.2) 
of linear transformations, 202 
of operators, 203 
Compound data in Scheme, 
190-192 
cond, 188 

Conditionals in Scheme, 188-189 
Confusion: differential of map 
and of manifold function, 73 
Connection 
Cartan one-forms, 96 
Christoffel coefficients and, 98 
from Lagrange equations, 137 
from metric, 135 
cons, 191 

Consequent in conditional, 188 
Constraint, 3, 112 
Constructors in Scheme, 190 
contract, 135 
Contraction of tuples, 200 
Contravariant, 29 
Coordinate basis 
one-form field, 34 
one-form field traditional 
notation, 36 
vector field, 26-29 
vector field traditional notation, 
27 

Coordinate component functions 
one-form field, 34 
vector field, 23 
Coordinate function, 11-14 
Coordinate independence, 38 
integration, 55 
manifold functions, 15 
one-form field, 38 
vector field, 22, 28 
Coordinate patch, on manifold, 11 

Coordinate representation 
manifold functions, 14 
one-form fields, 35 
vector fields, 25 
Coordinate transformations 
one-form field, 38 
vector field, 28 
coordinate-system, 13 
coordinate-system-at, 13 n 
coordinate-system->basis, 9, 46 

coordinatize, 25 

Cosmology, 149 (ex. 9.9), 150 
(ex. 9.10) 

Coulomb, 167 
Covariant, 28 

Covariant derivative, 8, 93-104 
change of basis, 100 
directional derivative and, 93 
generalized product rule, 99 
geodesic, 8 

Lie derivative and, 113 (ex. 7.2) 
nabla ($\nabla$) notation, 8 
not a tensor, 214 
of one-form field, 99 
of vector field, 93 
over a map, 106 
parallel transport and, 93 
product rule, 97 
covariant-derivative, 9 
covariant-derivative-form, 100 
covariant-derivative-vector, 97 

curl, 154 
curl, 155 

Curry, Haskell, 77 n 
Currying, 77 
Curvature, 115-131 
by explicit transport, 116 
intrinsic, 115 

pseudosphere, 123 (ex. 8.2), 144 
(ex. 9.4) 

Riemann curvature operator 
and, 115 


Schwarzschild spacetime, 147 
(ex. 9.6) 

spherical surface, 123 (ex. 8.1), 
143 (ex. 9.3) 
universe, 150 (ex. 9.10) 

d$\omega$ (exterior derivative of form 
field), 63 

D, D (derivative $R^n \to R^m$), 202 

define, 187 

define-coordinates, 16, 27, 36 
Definitions in Scheme, 187-188 
Derivative, 202-207. See also 
Covariant derivative; 
Directional derivative; 

Exterior derivative; Lie 
derivative; Partial derivative; 
Vector field 

as best linear approximation, 21 
as operator, 203 
chain rule, 203, 208 (ex. B.1) 
in Scheme programs: D, 202 
notation: D, 202 
of function of multiple 
arguments, 204-207 
of function with structured 
inputs and outputs, 208 
precedence of, 202 
Determinant, 57 n, 61 
df (differential of manifold 
function), 33, 35 
Difference between points on 
manifold, 21 
Differential 

in a coordinate basis, 35 
of manifold function (df), 33 
of map, 72 
pullback and, 77 
differential, 10, 12 
Differential equation, integral 
curve of, 29 

Dimension of manifold, 12 
Directional derivative, 83-114 
all agree on functions, 83 
covariant derivative, 93 
extended Leibniz rule, 85 
formulation of method of 
transport, 83 
general properties, 84 
Leibniz rule, 84 
Lie derivative, 85 
of form field, 83 
of manifold function, 22 
of vector field, 22, 83 
using ordinary derivative, 84 
div, 154 

divergence, 156 
Divergence Theorem, 67 
down, 199 
Down tuple, 198 
drop2, 147 n 
Dual basis, 42 
over a map, 74 

Dual forms used to determine 
vector field coefficients, 43 
Duality, 41 
Hodge, 153 
illustrated, 36 
one-form field, 34 
Dust stress-energy tensor, 147 

Einstein 

field equations, 145 
special relativity, 167 
tensor, 145 n 
Einstein, 149 (ex. 9.9) 
Einstein-field-equation, 149 
(ex. 9.9) 

Electrodynamics, 160-170 
else, 188 
Empty list, 191 
eq?, 193 
Euler angles, 47 
alternate angles, 52 (ex. 4.1) 
Euler-Lagrange equations, 2 
Event, 179 
Evolution 

Hamiltonian, 113 (ex. 7.1) 
integral curve, 30 
Evolution operator, 31 
Exact form field, 64 
Exact forms are closed, 64 


Exponentiating Lie derivatives, 
91 

Expressions in Scheme, 185 
Extended rotation, 177 
Exterior derivative, 33 n, 62-65 
Cartan’s formula and, 92 
commutes with Lie derivative, 
90 

commutes with pullback, 80 
coordinate-system independent, 
62 

general definition, 62 
graded formula, 69 (ex. 5.2) 
iterated, 69 (ex. 5.3) 
obeys graded Leibniz rule, 64 
of one-form, 62 
Stokes’s Theorem and, 62 

F->C, 4 
Faraday, 167 
Faraday, 160 
Faraday tensor, 160 
Field equations 
Einstein, 145 
Maxwell, 162 

FLRW-metric, 150 (ex. 9.9) 

Force, Lorentz, 164 
relativistic, 166 (ex. 10.1) 

Form field 
closed, 64 
exact, 64 
pushforward, 81 

form-field->form-field-over-map, 74 

Formal parameters of a 
procedure, 187 
Franklin, 167 

Friedmann metric, 150 (ex. 9.9) 
Friedmann-Lemaître-Robertson-Walker, 149 (ex. 9.9) 
Function(s), 196-197 
arithmetic operations on, 196 
composition of, 196, 209 (ex. 
B.2) 

operators vs., 203 
selector, 199 
tuple of, 207 
vs. value when applied, 195, 196 
with multiple arguments, 204, 
205, 208 (ex. B.2) 
with structured arguments, 205, 
208 (ex. B.2) 

with structured output, 207, 208 (ex. B.2) 

Functional mathematical 
notation, 195 
Fundamental Theorem of 
Calculus, 66 

g-Minkowski, 159 
Galileo Galilei, 1 
Gauss, 167 

General relativity, 144-151, 172 
general-boost, 176 
general-boost2, 177 
Generic arithmetic, 195 
Geodesic deviation, 125-129 
relative acceleration of 
neighboring geodesics and, 125 
Riemann curvature and, 126 
vector field, 126 
Geodesic motion, 2, 111 
governing equation, 8, 111 
governing equation, in 
coordinates, 9, 111 
Lagrange equations and, 2, 112 
SO(3) on, 143 (ex. 9.2) 
Geometric function, 211 
Geometric object, 211 
Global Lorentz frame, 172 
grad, 154 
gradient, 154 
Green’s Theorem, 67 

Hamiltonian Evolution, 113 (ex. 7.1) 

Higher-rank forms, 57 
linear and antisymmetric, 57 
Hill climbing, power expended, 39 (ex. 3.3) 

Hodge dual, 153 
Hodge star, 153-158 
Honest definition rare, 79 

$I_j$ (selector), 199 
if, 188 


Inner product of tuples, 201 
Integral 

coordinate-independent 
definition, 58 
higher dimensions, 57-58 
higher-rank forms, 57 
one-form, 56 
Integral curve, 29-32 
differential equation, 29 
evolution, 30 
Taylor series, 30 
Integration, 55-69 
coordinate-independent notion 
of, 55 

Interior product, 92 
Iteration in Scheme, 190 

Jacobian, 28 
basis fields, 45 

Knuth, Donald E., 219 
Kronecker delta, 34 n 

$\mathcal{L}_v$ (Lie derivative), 85 
Lagrange equations, 2 n 
Christoffel coefficients and, 138 
geodesic equations and, 137 
Lagrange-explicit, 138 n 
Lagrangian, 2 
constraint, 3 
free particle, 3, 137 
metric and, 137 
lambda, 186 
Lambda calculus, 195 
Lambda expression, 186-187 
Laplacian, 156 
wave equation and, 160 
Laplacian, 156 
Leibniz rule (product rule) for 
vector field, 26 

Lemniscate of Bernoulli, 18 (ex. 2.1) 

let, 189 

Lie derivative, 85-93 
alternate formulation, 88 
as commutator, 86 
commutes with exterior 
derivative, 90 
covariant derivative and, 113 
(ex. 7.2) 

directional derivative and, 85 
exponentiation, 91 
interpretation, 87 
of form field, 89 
of function, 85 
of vector field, 85 
properties, 89 
transport operator is 
pushforward, 85 
uniform interpretation, 89 
Lie-derivative-vector, 87 
Linear independence and basis 
fields, 43 

Linear transformations as tuples, 
201 

Linearity 
one-form field, 33 
vector field, 26 
Lisp, 191 n 
list, 191 
list-ref, 191 
Lists in Scheme, 190-192 
Literal symbol in Scheme, 192-193 

literal-1form-field, 35 
literal-Christoffel-2, 119 n 
literal-function, 16, 197, 206 
literal-manifold-function, 16 n 

literal-manifold-map, 7 n 
literal-metric, 6 n 
literal-vector-field, 24 

Local names in Scheme, 189-190 
Loops in Scheme, 190 
Lorentz, 169 

Lorentz decomposition, 179 (ex. 11.1) 

Lorentz force, 164 
relativistic, 166 (ex. 10.1) 
Lorentz frame, global, 172 
Lorentz interval, 172 
Lorentz transformation, 172 
general, 178 
simple, 173 

unique decomposition, 178 


lower, 134 

Lowering a vector field, 133 

make-4tuple, 175 
make-fake-vector-field, 74 
make-manifold, 12 
make-SR-coordinates, 179 n 
make-SR-frame, 180 
Manifold, 11-19 
atlas, 11 
chart, 11 

coordinate function, 11-14 
coordinate patch, 11 
coordinate-independence of 
manifold functions, 15 
difference between points, 21 
dimension of, 12 
motion and paths, 71 
naming coordinate functions, 16 
Manifold function, 14-17 
coordinate representation, 14 
directional derivative, 22 
Matrix as tuple, 201 
Maxwell, 167 
Maxwell, 161 
Maxwell tensor, 161 
Maxwell’s equations, 162 
Metric, 5, 133-151 
Christoffel coefficients and, 9, 
135 

connection compatible with, 135 
Friedmann, 150 (ex. 9.9) 
Lagrangian and, 137 
Minkowski, 133, 159 
metric->Christoffel-2, 9 
metric->Lagrangian, 138 
Minkowski metric, 133, 159 
Motion 

geodesic in General Relativity, 
144 

on a sphere, 10 (ex. 1.1) 
on manifolds, 71 
Schwarzschild spacetime, 148 
(ex. 9.7), 148 (ex. 9.8) 
Multiplication of operators as 
composition, 203 
Multiplication of tuples, 200-202 
as composition, 202 
as contraction, 200 
Nabla ($\nabla$), 8 
Newton-connection, 146 
Newton-metric, 146 
Newton’s equations, 32 (ex. 3.1) 
Non-associativity and 
associativity of tuple 
multiplication, 201, 202 
Non-commutativity 
of some partial derivatives, 206 
of some tuple multiplication, 
202 

Non-coordinate basis, 41, 47 
commutator, 53 (ex. 4.2) 
Notation, 195-209 
$\nabla$, 8 

( ) for up tuples, 198 
[ ] for down tuples, 198 
abuse of, 17 n 

ambiguities of traditional, 195 
derivative, partial: $\partial$, 206 
derivative: D, 202 
functional, 195 
selector function: $I_j$, 199 
Notation for coordinate basis 
one-form field, 36 
vector field, 27 

Oersted, 167 
One-form field, 32-39 
coefficient functions, 34 
coordinate basis, 34 
coordinate independence, 38 
coordinate representation, 35 
coordinate transformations, 38 
differential, 33 
duality, 34 
general, 34 
linearity, 33 

not all are differentials, 37 
over a map, 71, 73 
raising, 134 
Operator, 203 

arithmetic operations on, 203 
function vs., 203 

Oriented area of a parallelogram, 
59 

Over a map, 71-81 
basis fields, 74 


covariant derivative, 106 
dual basis, 74 
one-form field, 73 
vector field, 71 

pair?, 192 
Pairs in Scheme, 191 
Parallel transport, 3, 93, 104-112 
Cartan one-forms and, 96 
equations similar to variational 
equations, 108 
for arbitrary paths, 104 
geodesic motion and, 111 
governing equations, 106 
independent of rate of 
transport, 94 

numerical integration, 109 
on a sphere, 106 
path-dependent, 93 
Parentheses 
in Scheme, 185, 186 n 
for up tuples, 198 
Partial derivative, 204-206 
chain rule, 205, 208 (ex. B.l) 
commutativity of, 37 
notation: $\partial$, 206 
patch, 12 

Patch, on manifold, 11 
Paths and manifolds, 71 
Perfect-fluid stress-energy tensor, 
150 (ex. 9.9) 

Phase-space state function, 205 
in Scheme, 206 
Poincare transformation, 172 
point, 7, 14 

Power expended in hill climbing, 
39 (ex. 3.3) 

Predicate in conditional, 188 
print-expression, 31, 197 n 
Procedure calls, 185-186 
procedure->vector-field, 24 
Product rule (Leibniz rule) for 
vector field, 26 

Projection, stereographic, 19 (ex. 2.2) 

Proper length, 159 
proper-space-interval, 176 
Proper time, 159 
proper-time-interval, 176 
Pullback 

commutes with exterior 
derivative, 80 
differential and, 77 
of form field, 79 
of function, 76 
of vector field, 79 
properties, 80 

vector field over a map as, 77 
pullback-form, 80 
pullback-function, 77 
pullback-vector-field, 79 
Pushforward 
along integral curves, 78 
of form field, 81 
of function, 76 
of vector field, 78 
pushforward-vector, 78 

Quotation in Scheme, 192-193 

$R_x$, $R_z$ (rotations), 47 

R2-polar, 7 n 
R2-polar-Cartan, 102 
R2->R, 16 
R2-rect, 7 n, 13 n 
R2-rect-Cartan, 102 
R2-rect-Christoffel, 102 
R2-rect-point, 16 
R3-cyl, 18 (ex. 2.1) 
raise, 135 

Raising a one-form field, 134 
Recursive procedures, 189 
ref, 200 

Reparameterization, 141 
Residual, xvi 
Restricted vector field, 71 
Ricci, 123 

Ricci scalar, 144 (ex. 9.3) 

Ricci tensor, 123 
Riemann, 116 
Riemann curvature 
in terms of Cartan one-forms, 
120 

way to compute, 120 
Riemann-curvature, 115 
Riemann curvature operator, 115 


Riemann tensor, 115 
by explicit transport, 116 
for sphere, 116 
is a tensor, 212 

Robertson-Walker equations, 150 
(ex. 9.9) 

Rotation 

extended to Lorentz 
transformation, 177 
as tuples, 201 

S2-basis, 75 
S2-Christoffel, 107 
S2-spherical, 75 
s:map/r, 45 
Salam, Abdus, 195 
Scheme, 185-193, 195 
lists, 191 

quotation, 192-193 
vectors, 192 

Schönfinkel, Moses, 77 n 
Schwarzschild-metric, 148 (ex. 
9.6) 

Schwarzschild spacetime 
circular orbit, 148 (ex. 9.7) 
circular orbit stability, 148 (ex. 
9.8) 

curvature, 147 (ex. 9.6) 
Scmutils, 195-209 
generic arithmetic, 195 
simplification of expressions, 

197 

Selector functions, 190, 199 
Semicolon in tuple, 206 
series:for-each, 31 
Simplification of expressions, 197 
SO(3), 47 

angular momentum and, 53 (ex. 
4.3) 

geodesics, 143 (ex. 9.2) 
SO3-metric, 143 (ex. 9.2) 
Spacelike, 159 
Spacetime, 144 
Special orthogonal 
group— SO(3), 47 
Special relativity, 167-184 
electrodynamics, 160-170 
frame, 179 






Minkowski metric, 159 
twin paradox, 181 
velocity addition, 180 

sphere-Cartan, 107 
sphere->R3, 4 

spherical-metric, 143 (ex. 9.3) 
Spherical surface, curvature, 143 
(ex. 9.3) 

Spivak, Michael, 195 
on notation, 28 n 
square, 187 
for tuples, 187 n 
Stability of circular orbits in 
Schwarzschild spacetime, 148 
(ex. 9.8) 

State derivative, 32 (ex. 3.1) 
Stereographic projection, 18 (ex. 
2.2) 

Stokes’s Theorem, 65-66 
proof, 65 
states, 65 

Stress-energy tensor, 145 
dust, 147 

perfect fluid, 150 (ex. 9.9) 
Structure constants, 50, 125 
Subscripts 

for down-tuple components, 198 
for selectors, 199 
Superscripts 

for up-tuple components, 12, 

198 

Symbolic values, 197-198 
Symbols in Scheme, 192-193 
Syntactic sugar, 187 

Taylor series of integral curve, 30 
Tdust, 147 
Tensor, 211-215 
components, 211 
linear over functions, 211 
stress-energy, 145 
transformation with change of 
basis, 211 

Tensor arithmetic vs. tuple 
arithmetic, 195, 199 
tensor-test, 213 
Timelike, 159 


Torsion, 124-125 
components in a non-coordinate 
basis, 125 
is a tensor, 214 
torsion, 124 

Tperfect-fluid, 150 (ex. 9.9) 
trace2down, 144 (ex. 9.3) 
Traditional notation 
coordinate-basis one-form field, 
36 

coordinate-basis vector field, 27 
Tuples, 198-202 
arithmetic on, 195, 199-202 
commas and semicolons in, 206 
component selector: Ij, 199 
composition and, 202 
contraction, 200 
down and up, 198 
of functions, 207 
inner product, 201 
linear transformations as, 201 
matrices as, 201 
multiplication of, 200-202 
rotations as, 201 
squaring, 187 n, 200 
up and down, 198 
Twin paradox, 181-184 

up, 199 

Up tuple, 12, 198 

Vector, in Scheme, 190-192 
vector, 192 
vector?, 192 
vector-ref, 192 
vector-basis->dual, 44 
Vector calculus, 154-158 
Vector field, 21-32 
as linear combination of partial 
derivatives, 23 
chain rule, 26 
coefficient functions, 23 
commutator of, 48 
coordinate basis, 26-29 
coordinate independence, 22, 28 
coordinate representation, 25 
coordinate transformations, 28 
directional derivative, 22 
Leibniz rule, 26 
linearity, 26 
lowering, 133 
module, 25 
operator, 24 
over a map, 71 
product rule, 26 
properties, 25-26 
restricted, 71 
traditional notation for 
coordinate-basis vector fields, 
27 

Vector field over a map, 71-73 
as pullback, 77 

vector-field->vector-field-over-map, 72 

Vector integral theorems, 67-69 
Divergence Theorem, 67 
Green’s Theorem, 67 
Vector space of tuples, 200 
Velocity 

addition of in special relativity, 
180 

at a time, 73 
components, 76 
differential and, 73 
on a globe, 81 (ex. 6.1) 

Volume, 59 

Walking on a sphere, 75 
Wave equation, 159 
Laplacian and, 160 
Wedge product, 58 
antisymmetry, 61 
associativity of, 59 
construction of antisymmetric 
higher-rank forms, 58 
determinant and, 61 
oriented area of a 
parallelogram, 59 

Zero-based indexing, 191, 198, 
200 




